1. 11 Jun, 2019 1 commit
    • Fix memory leak in sctp_process_init · 2cdb66cc
      Neil Horman authored
      [ Upstream commit 0a8dd9f67cd0da7dc284f48b032ce00db1a68791 ]
      
      syzbot found the following leak in sctp_process_init
      BUG: memory leak
      unreferenced object 0xffff88810ef68400 (size 1024):
        comm "syz-executor273", pid 7046, jiffies 4294945598 (age 28.770s)
        hex dump (first 32 bytes):
          1d de 28 8d de 0b 1b e3 b5 c2 f9 68 fd 1a 97 25  ..(........h...%
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<00000000a02cebbd>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
          [<00000000a02cebbd>] slab_post_alloc_hook mm/slab.h:439 [inline]
          [<00000000a02cebbd>] slab_alloc mm/slab.c:3326 [inline]
          [<00000000a02cebbd>] __do_kmalloc mm/slab.c:3658 [inline]
          [<00000000a02cebbd>] __kmalloc_track_caller+0x15d/0x2c0 mm/slab.c:3675
          [<000000009e6245e6>] kmemdup+0x27/0x60 mm/util.c:119
          [<00000000dfdc5d2d>] kmemdup include/linux/string.h:432 [inline]
          [<00000000dfdc5d2d>] sctp_process_init+0xa7e/0xc20 net/sctp/sm_make_chunk.c:2437
          [<00000000b58b62f8>] sctp_cmd_process_init net/sctp/sm_sideeffect.c:682 [inline]
          [<00000000b58b62f8>] sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1384 [inline]
          [<00000000b58b62f8>] sctp_side_effects net/sctp/sm_sideeffect.c:1194 [inline]
          [<00000000b58b62f8>] sctp_do_sm+0xbdc/0x1d60 net/sctp/sm_sideeffect.c:1165
          [<0000000044e11f96>] sctp_assoc_bh_rcv+0x13c/0x200 net/sctp/associola.c:1074
          [<00000000ec43804d>] sctp_inq_push+0x7f/0xb0 net/sctp/inqueue.c:95
          [<00000000726aa954>] sctp_backlog_rcv+0x5e/0x2a0 net/sctp/input.c:354
          [<00000000d9e249a8>] sk_backlog_rcv include/net/sock.h:950 [inline]
          [<00000000d9e249a8>] __release_sock+0xab/0x110 net/core/sock.c:2418
          [<00000000acae44fa>] release_sock+0x37/0xd0 net/core/sock.c:2934
          [<00000000963cc9ae>] sctp_sendmsg+0x2c0/0x990 net/sctp/socket.c:2122
          [<00000000a7fc7565>] inet_sendmsg+0x64/0x120 net/ipv4/af_inet.c:802
          [<00000000b732cbd3>] sock_sendmsg_nosec net/socket.c:652 [inline]
          [<00000000b732cbd3>] sock_sendmsg+0x54/0x70 net/socket.c:671
          [<00000000274c57ab>] ___sys_sendmsg+0x393/0x3c0 net/socket.c:2292
          [<000000008252aedb>] __sys_sendmsg+0x80/0xf0 net/socket.c:2330
          [<00000000f7bf23d1>] __do_sys_sendmsg net/socket.c:2339 [inline]
          [<00000000f7bf23d1>] __se_sys_sendmsg net/socket.c:2337 [inline]
          [<00000000f7bf23d1>] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2337
          [<00000000a8b4131f>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:3
      
      The problem was that the peer.cookie value points to an skb allocated
      area on the first pass through this function, at which point it is
      overwritten with a heap allocated value, but in certain cases, where a
      COOKIE_ECHO chunk is included in the packet, a second pass through
      sctp_process_init is made, where the cookie value is re-allocated,
      leaking the first allocation.
      
      Fix is to always allocate the cookie value, and free it when we are done
      using it.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Reported-by: syzbot+f7e9153b037eac9b1df8@syzkaller.appspotmail.com
      CC: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: netdev@vger.kernel.org
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
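The ownership rule described in the sctp_process_init fix above (always take a heap copy of the cookie, free it once it has been used) can be sketched in userspace C. This is only an illustration under stated assumptions: `struct peer`, `process_init()`, `cookie_done()` and the `live_allocs` counter are hypothetical names, and `malloc`/`free` stand in for the kernel's `kmemdup`/`kfree`.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the peer state; not the kernel's struct. */
struct peer {
    unsigned char *cookie;   /* always heap-owned in this sketch */
    size_t cookie_len;
};

static int live_allocs;      /* outstanding cookie buffers, for checking */

/* Always take a private heap copy of the cookie found in the packet. */
static int process_init(struct peer *p, const unsigned char *pkt_cookie,
                        size_t len)
{
    unsigned char *copy = malloc(len);

    if (!copy)
        return -1;
    memcpy(copy, pkt_cookie, len);
    live_allocs++;
    p->cookie = copy;
    p->cookie_len = len;
    return 0;
}

/* Called once the cookie has been used; releases the heap copy. */
static void cookie_done(struct peer *p)
{
    if (p->cookie) {
        free(p->cookie);
        live_allocs--;
    }
    p->cookie = NULL;
    p->cookie_len = 0;
}
```

Because every pass pairs one allocation with one release, a second pass through `process_init()` no longer overwrites a pointer that still owns memory.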
  2. 08 May, 2019 1 commit
    • sctp: avoid running the sctp state machine recursively · a5d00345
      Xin Long authored
      [ Upstream commit fbd019737d71e405f86549fd738f81e2ff3dd073 ]
      
      Ying triggered a call trace while doing asconf testing:
      
        BUG: scheduling while atomic: swapper/12/0/0x10000100
        Call Trace:
         <IRQ>  [<ffffffffa4375904>] dump_stack+0x19/0x1b
         [<ffffffffa436fcaf>] __schedule_bug+0x64/0x72
         [<ffffffffa437b93a>] __schedule+0x9ba/0xa00
         [<ffffffffa3cd5326>] __cond_resched+0x26/0x30
         [<ffffffffa437bc4a>] _cond_resched+0x3a/0x50
         [<ffffffffa3e22be8>] kmem_cache_alloc_node+0x38/0x200
         [<ffffffffa423512d>] __alloc_skb+0x5d/0x2d0
         [<ffffffffc0995320>] sctp_packet_transmit+0x610/0xa20 [sctp]
         [<ffffffffc098510e>] sctp_outq_flush+0x2ce/0xc00 [sctp]
         [<ffffffffc098646c>] sctp_outq_uncork+0x1c/0x20 [sctp]
         [<ffffffffc0977338>] sctp_cmd_interpreter.isra.22+0xc8/0x1460 [sctp]
         [<ffffffffc0976ad1>] sctp_do_sm+0xe1/0x350 [sctp]
         [<ffffffffc099443d>] sctp_primitive_ASCONF+0x3d/0x50 [sctp]
         [<ffffffffc0977384>] sctp_cmd_interpreter.isra.22+0x114/0x1460 [sctp]
         [<ffffffffc0976ad1>] sctp_do_sm+0xe1/0x350 [sctp]
         [<ffffffffc097b3a4>] sctp_assoc_bh_rcv+0xf4/0x1b0 [sctp]
         [<ffffffffc09840f1>] sctp_inq_push+0x51/0x70 [sctp]
         [<ffffffffc099732b>] sctp_rcv+0xa8b/0xbd0 [sctp]
      
      As the trace shows, the first sctp_do_sm(), running in atomic context
      (NET_RX softirq), invoked sctp_primitive_ASCONF(), which later uses the
      GFP_KERNEL flag, and this flag is supposed to be used in non-atomic
      context only. Besides, sctp_do_sm() was called recursively, which is
      not expected.
      
      Vlad tried to fix this recursive call in Commit c0786693 ("sctp: Fix
      oops when sending queued ASCONF chunks") by introducing a new command
      SCTP_CMD_SEND_NEXT_ASCONF. But it didn't work, as this command is still
      used in the first sctp_do_sm() call, and sctp_primitive_ASCONF() is
      called again while handling this command.
      
      To avoid calling sctp_do_sm() recursively, we send the next queued ASCONF
      not by sctp_primitive_ASCONF(), but by calling sctp_sf_do_prm_asconf()
      directly in the first sctp_do_sm().
      Reported-by: Ying Xu <yinxu@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
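The difference between re-entering the dispatcher and calling the state function directly, as described in the commit above, can be sketched with a toy command interpreter. Everything here is a hypothetical model rather than the SCTP code: `do_sm()`, `interpret()`, `sf_prm_asconf()`, the two commands and the `recursive` switch exist only to make the nesting depth observable.

```c
#include <stddef.h>

enum cmd { CMD_SEND_NEXT_ASCONF, CMD_TRANSMIT };

#define MAX_CMDS 8
struct cmd_seq {
    enum cmd cmds[MAX_CMDS];
    size_t n;      /* commands queued */
    size_t next;   /* next command to run */
};

static int depth, max_depth, transmitted;

static void add_cmd(struct cmd_seq *seq, enum cmd c)
{
    if (seq->n < MAX_CMDS)
        seq->cmds[seq->n++] = c;
}

/* State function: only queues work on the sequence it is given. */
static void sf_prm_asconf(struct cmd_seq *seq)
{
    add_cmd(seq, CMD_TRANSMIT);
}

static void do_sm(enum cmd first, int recursive);

static void interpret(struct cmd_seq *seq, int recursive)
{
    while (seq->next < seq->n) {
        switch (seq->cmds[seq->next++]) {
        case CMD_SEND_NEXT_ASCONF:
            if (recursive)
                do_sm(CMD_TRANSMIT, recursive); /* re-enters the dispatcher */
            else
                sf_prm_asconf(seq);             /* extends the current sequence */
            break;
        case CMD_TRANSMIT:
            transmitted++;
            break;
        }
    }
}

/* Dispatcher: builds a command sequence and runs it, tracking nesting. */
static void do_sm(enum cmd first, int recursive)
{
    struct cmd_seq seq = { .n = 0, .next = 0 };

    if (++depth > max_depth)
        max_depth = depth;
    add_cmd(&seq, first);
    interpret(&seq, recursive);
    depth--;
}
```

In the recursive variant the dispatcher nests two levels deep; in the direct variant the same work is done by the one interpreter loop that is already running.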
  3. 29 Oct, 2017 2 commits
  4. 11 Aug, 2017 5 commits
  5. 07 Aug, 2017 4 commits
  6. 03 Aug, 2017 1 commit
  7. 01 Jul, 2017 2 commits
  8. 20 Jun, 2017 2 commits
  9. 20 Feb, 2017 1 commit
  10. 07 Feb, 2017 1 commit
  11. 18 Jan, 2017 1 commit
    • sctp: add stream reconf timer · 7b9438de
      Xin Long authored
      This patch adds a per-transport timer, based on the sctp timer framework,
      for stream reconf chunk retransmission. It starts after a reconf request
      chunk is sent and stops after the response chunk is received.
      
      If the timer expires, besides retransmitting the reconf request chunk,
      it also does the same things as the data RTO timer: it increases the
      appropriate error counts and performs threshold management, possibly
      destroying the asoc if sctp retransmission thresholds are exceeded, just
      as section 5.1.1 describes.
      
      This patch also adds asoc strreset_chunk, which is used to save the
      reconf request chunk so that it can be retransmitted, and to check
      whether a response is really for this request by comparing the
      information inside it with the response chunk.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
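The timer life cycle described in the reconf timer commit above (arm on request, stop on matching response, retransmit and count errors on expiry) might look roughly like this. It is a simplified single-transport model with hypothetical names (`struct assoc`, `send_reconf()`, `reconf_timer_expired()`, `recv_reconf_resp()`); the real patch hooks into the SCTP timer machinery, which is not reproduced here.

```c
#include <stdbool.h>
#include <stddef.h>

struct reconf_chunk { unsigned int request_seq; };

/* Hypothetical association state; field names are illustrative only. */
struct assoc {
    const struct reconf_chunk *strreset_chunk; /* saved request, NULL if none */
    bool timer_running;
    int transport_errors;
    int overall_errors;
    int max_retrans;
    int retransmits;
    bool dead;
};

/* Sending a request saves the chunk and starts the timer. */
static void send_reconf(struct assoc *a, const struct reconf_chunk *req)
{
    a->strreset_chunk = req;
    a->timer_running = true;
}

/* Expiry: count the error, apply the threshold, otherwise retransmit. */
static void reconf_timer_expired(struct assoc *a)
{
    if (!a->timer_running || !a->strreset_chunk)
        return;
    a->transport_errors++;
    a->overall_errors++;
    if (a->overall_errors > a->max_retrans) {
        a->dead = true;              /* threshold exceeded: give up */
        a->timer_running = false;
        a->strreset_chunk = NULL;
        return;
    }
    a->retransmits++;                /* resend saved request, timer stays armed */
}

/* A response only counts if it matches the saved request. */
static bool recv_reconf_resp(struct assoc *a, unsigned int response_seq)
{
    if (!a->strreset_chunk || a->strreset_chunk->request_seq != response_seq)
        return false;
    a->strreset_chunk = NULL;
    a->timer_running = false;
    return true;
}
```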
  12. 19 Sep, 2016 2 commits
    • sctp: make sctp_outq_flush/tail/uncork return void · 83dbc3d4
      Xin Long authored
      The sctp_outq_flush return value is meaningless now. This patch
      makes sctp_outq_flush return void, as well as sctp_outq_tail
      and sctp_outq_uncork.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: do not return the transmit err back to sctp_sendmsg · 66388f2c
      Xin Long authored
      Once a chunk is enqueued successfully, the sctp queues can take care of it.
      Even if it fails to transmit (for example because of nomem), it should be
      put into the retransmit queue.
      
      If sctp reports this error to users, it confuses them: they may resend
      that msg, but the in-kernel sctp stack is already in charge of
      retransmitting it.
      
      Besides, this error probably does not come from a failure to transmit the
      current msg, but from transmitting or retransmitting another msg's chunks,
      as sctp_outq_flush just tries to send out all transports' chunks.
      
      This patch makes sctp_cmd_send_msg return void, and does not return the
      transmit err back to sctp_sendmsg.
      
      Fixes: 8b570dc9 ("sctp: only drop the reference on the datamsg after sending a msg")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
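The error-reporting policy from the commit above (report only enqueue failures, absorb transmit failures into the retransmit queue) can be modeled with a small queue. The `struct outq` fields, `link_ok`, `outq_flush()` and `send_msg()` below are hypothetical simplifications, not the kernel's sctp_outq API.

```c
#include <stdbool.h>

/* Toy output queue; counters replace real chunk lists. */
struct outq {
    int queued;        /* chunks waiting for first transmission */
    int retransmit;    /* chunks waiting for retransmission */
    int sent;
    int capacity;
    bool link_ok;      /* hypothetical: whether transmit currently succeeds */
};

/* Returns void on purpose: transmit failures are handled internally. */
static void outq_flush(struct outq *q)
{
    while (q->queued > 0) {
        q->queued--;
        if (q->link_ok)
            q->sent++;
        else
            q->retransmit++;   /* the stack retries later */
    }
}

/* Only an enqueue failure is reported to the caller. */
static int send_msg(struct outq *q)
{
    if (q->queued + q->retransmit >= q->capacity)
        return -1;
    q->queued++;
    outq_flush(q);
    return 0;
}
```

With `link_ok` false, `send_msg()` still returns 0 as long as the chunk was queued, so the caller has no reason to resend a message the stack will retransmit anyway.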
  13. 11 Jun, 2016 1 commit
  14. 02 May, 2016 1 commit
  15. 14 Apr, 2016 1 commit
  16. 11 Apr, 2016 1 commit
    • sctp: avoid refreshing heartbeat timer too often · ba6f5e33
      Marcelo Ricardo Leitner authored
      Currently on high rate SCTP streams the heartbeat timer refresh can
      consume quite a lot of resources as timer updates are costly and it
      contains a random factor, which a) is also costly and b) invalidates
      mod_timer() optimization for not editing a timer to the same value.
      It may even cause the timer to be slightly advanced, for no good reason.
      
      As suggested by David Laight this patch now removes this timer update
      from hot path by leaving the timer on and re-evaluating upon its
      expiration if the heartbeat is still needed or not, similarly to what is
      done for TCP. If it's not needed anymore the timer is re-scheduled to
      the new timeout, considering the time already elapsed.
      
      For this, we now record the last tx timestamp per transport, updated in
      the same spots as hb timer was restarted on tx. Also split up
      sctp_transport_reset_timers into sctp_transport_reset_t3_rtx and
      sctp_transport_reset_hb_timer, so we can re-arm T3 without re-arming the
      heartbeat one.
      
      On loopback with an MTU of 65535 and data chunks of 1636 bytes, so that
      we have a considerable number of chunks without stressing system calls,
      running netperf -t SCTP_STREAM -l 30, perf looked like this before:
      
      Samples: 103K of event 'cpu-clock', Event count (approx.): 25833000000
        Overhead  Command  Shared Object      Symbol
      +    6,15%  netperf  [kernel.vmlinux]   [k] copy_user_enhanced_fast_string
      -    5,43%  netperf  [kernel.vmlinux]   [k] _raw_write_unlock_irqrestore
         - _raw_write_unlock_irqrestore
            - 96,54% _raw_spin_unlock_irqrestore
               - 36,14% mod_timer
                  + 97,24% sctp_transport_reset_timers
                  + 2,76% sctp_do_sm
               + 33,65% __wake_up_sync_key
               + 28,77% sctp_ulpq_tail_event
               + 1,40% del_timer
            - 1,84% mod_timer
               + 99,03% sctp_transport_reset_timers
               + 0,97% sctp_do_sm
            + 1,50% sctp_ulpq_tail_event
      
      And after this patch, now with netperf -l 60:
      
      Samples: 230K of event 'cpu-clock', Event count (approx.): 57707250000
        Overhead  Command  Shared Object      Symbol
      +    5,65%  netperf  [kernel.vmlinux]   [k] memcpy_erms
      +    5,59%  netperf  [kernel.vmlinux]   [k] copy_user_enhanced_fast_string
      -    5,05%  netperf  [kernel.vmlinux]   [k] _raw_spin_unlock_irqrestore
         - _raw_spin_unlock_irqrestore
            + 49,89% __wake_up_sync_key
            + 45,68% sctp_ulpq_tail_event
            - 2,85% mod_timer
               + 76,51% sctp_transport_reset_t3_rtx
               + 23,49% sctp_do_sm
            + 1,55% del_timer
      +    2,50%  netperf  [sctp]             [k] sctp_datamsg_from_user
      +    2,26%  netperf  [sctp]             [k] sctp_sendmsg
      
      Throughput-wise, it went from 6800 Mbps without the patch to 7050 Mbps
      with it, a gain of ~3.7%.
      Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
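The approach in the heartbeat timer commit above (store a timestamp on the hot path, decide at expiry whether a heartbeat is still due) can be sketched as follows. Times are plain tick counts, the random jitter mentioned in the commit is left out, and `struct transport`, `on_transmit()` and `hb_timer_expired()` are hypothetical names for this illustration.

```c
/* Times are in arbitrary ticks; all names here are illustrative. */
struct transport {
    unsigned long last_tx;      /* updated on every transmit (cheap store) */
    unsigned long hb_interval;
    unsigned long hb_expires;   /* when the timer fires next */
    int heartbeats_sent;
};

/* Hot path: no timer update, only a timestamp. */
static void on_transmit(struct transport *t, unsigned long now)
{
    t->last_tx = now;
}

/* Timer callback: decide whether a heartbeat is still needed. */
static void hb_timer_expired(struct transport *t, unsigned long now)
{
    unsigned long elapsed = now - t->last_tx;

    if (elapsed < t->hb_interval) {
        /* Recent traffic: re-arm only for the remaining time. */
        t->hb_expires = now + (t->hb_interval - elapsed);
        return;
    }
    t->heartbeats_sent++;
    t->hb_expires = now + t->hb_interval;
}
```

The cost moves from one timer modification per transmitted packet to one comparison per timer expiry.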
  17. 20 Mar, 2016 1 commit
  18. 14 Mar, 2016 1 commit
    • sctp: allow sctp_transmit_packet and others to use gfp · cea8768f
      Marcelo Ricardo Leitner authored
      Currently sctp_sendmsg() triggers some calls that will allocate memory
      with GFP_ATOMIC even when not necessary. In the case of
      sctp_packet_transmit, it will allocate a linear skb that will be used to
      construct the packet, and this may cause sends to fail due to ENOMEM more
      often than anticipated, especially with big MTUs.
      
      This patch thus allows it to inherit gfp flags from upper calls so that
      it can use GFP_KERNEL if it was triggered by a sctp_sendmsg call or
      similar. All others, like retransmits or flushes started from BH, are
      still allocated using GFP_ATOMIC.
      
      In netperf tests this didn't result in any performance drawbacks when
      memory is not too fragmented and made it trigger ENOMEM way less often.
      Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
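The idea of inheriting the allocation mode from the caller, as in the gfp commit above, amounts to threading a parameter through the call chain. The sketch below uses a hypothetical `enum alloc_mode` in place of the kernel's gfp flags, and all function names are made up for the illustration.

```c
/* Hypothetical stand-ins for GFP_ATOMIC and GFP_KERNEL. */
enum alloc_mode { ALLOC_ATOMIC, ALLOC_MAY_SLEEP };

static enum alloc_mode last_mode;   /* records what the allocator was asked */

/* Lowest level: allocates the packet buffer with the mode it was handed. */
static void packet_transmit(enum alloc_mode gfp)
{
    last_mode = gfp;
}

/* Intermediate layer: passes the caller's mode through unchanged. */
static void outq_flush(enum alloc_mode gfp)
{
    packet_transmit(gfp);
}

/* Process context (a send system call): sleeping allocations are fine. */
static void sendmsg_path(void)
{
    outq_flush(ALLOC_MAY_SLEEP);
}

/* Bottom-half context (retransmits, flushes): must not sleep. */
static void bh_retransmit_path(void)
{
    outq_flush(ALLOC_ATOMIC);
}
```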
  19. 28 Jan, 2016 1 commit
  20. 11 Jan, 2016 1 commit
  21. 05 Jan, 2016 1 commit
  22. 16 Dec, 2015 1 commit
  23. 29 Sep, 2015 2 commits
    • sctp: Prevent soft lockup when sctp_accept() is called during a timeout event · 635682a1
      Karl Heiss authored
      A case can occur when sctp_accept() is called by the user during
      a heartbeat timeout event after the 4-way handshake.  Since
      sctp_assoc_migrate() changes both assoc->base.sk and assoc->ep, the
      bh_lock_sock() in sctp_generate_heartbeat_event() will be taken on
      the listening socket but released on the new association socket.
      The result is a deadlock on any future attempt to take the listening
      socket lock.
      
      Note that this race can also occur with other SCTP timeouts that take
      the bh_lock_sock() in the event sctp_accept() is called.
      
       BUG: soft lockup - CPU#9 stuck for 67s! [swapper:0]
       ...
       RIP: 0010:[<ffffffff8152d48e>]  [<ffffffff8152d48e>] _spin_lock+0x1e/0x30
       RSP: 0018:ffff880028323b20  EFLAGS: 00000206
       RAX: 0000000000000002 RBX: ffff880028323b20 RCX: 0000000000000000
       RDX: 0000000000000000 RSI: ffff880028323be0 RDI: ffff8804632c4b48
       RBP: ffffffff8100bb93 R08: 0000000000000000 R09: 0000000000000000
       R10: ffff880610662280 R11: 0000000000000100 R12: ffff880028323aa0
       R13: ffff8804383c3880 R14: ffff880028323a90 R15: ffffffff81534225
       FS:  0000000000000000(0000) GS:ffff880028320000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
       CR2: 00000000006df528 CR3: 0000000001a85000 CR4: 00000000000006e0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
       Process swapper (pid: 0, threadinfo ffff880616b70000, task ffff880616b6cab0)
       Stack:
       ffff880028323c40 ffffffffa01c2582 ffff880614cfb020 0000000000000000
       <d> 0100000000000000 00000014383a6c44 ffff8804383c3880 ffff880614e93c00
       <d> ffff880614e93c00 0000000000000000 ffff8804632c4b00 ffff8804383c38b8
       Call Trace:
       <IRQ>
       [<ffffffffa01c2582>] ? sctp_rcv+0x492/0xa10 [sctp]
       [<ffffffff8148c559>] ? nf_iterate+0x69/0xb0
       [<ffffffff814974a0>] ? ip_local_deliver_finish+0x0/0x2d0
       [<ffffffff8148c716>] ? nf_hook_slow+0x76/0x120
       [<ffffffff814974a0>] ? ip_local_deliver_finish+0x0/0x2d0
       [<ffffffff8149757d>] ? ip_local_deliver_finish+0xdd/0x2d0
       [<ffffffff81497808>] ? ip_local_deliver+0x98/0xa0
       [<ffffffff81496ccd>] ? ip_rcv_finish+0x12d/0x440
       [<ffffffff81497255>] ? ip_rcv+0x275/0x350
       [<ffffffff8145cfeb>] ? __netif_receive_skb+0x4ab/0x750
       ...
      
      With lockdep debugging:
      
       =====================================
       [ BUG: bad unlock balance detected! ]
       -------------------------------------
       CslRx/12087 is trying to release lock (slock-AF_INET) at:
       [<ffffffffa01bcae0>] sctp_generate_timeout_event+0x40/0xe0 [sctp]
       but there are no more locks to release!
      
       other info that might help us debug this:
       2 locks held by CslRx/12087:
       #0:  (&asoc->timers[i]){+.-...}, at: [<ffffffff8108ce1f>] run_timer_softirq+0x16f/0x3e0
       #1:  (slock-AF_INET){+.-...}, at: [<ffffffffa01bcac3>] sctp_generate_timeout_event+0x23/0xe0 [sctp]
      
      Ensure the socket taken is also the same one that is released by
      saving a copy of the socket before entering the timeout event
      critical section.
      Signed-off-by: Karl Heiss <kheiss@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
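The unlock imbalance described in the soft lockup commit above, and the saved-pointer fix, can be shown in a single-threaded model. Here the migration is injected between lock and unlock to imitate the race; `struct sock`, `lock_depth`, `migrate()` and both `timeout_event_*` functions are hypothetical and only count lock operations.

```c
/* Toy socket: lock_depth counts lock minus unlock operations. */
struct sock { int lock_depth; };
struct assoc { struct sock *sk; };

static void lock_sock(struct sock *sk)   { sk->lock_depth++; }
static void unlock_sock(struct sock *sk) { sk->lock_depth--; }

/* Simulates accept() migrating the association to a new socket. */
static void migrate(struct assoc *a, struct sock *newsk)
{
    a->sk = newsk;
}

/* Buggy shape: re-reads a->sk at unlock time. */
static void timeout_event_buggy(struct assoc *a, struct sock *migrate_to)
{
    lock_sock(a->sk);
    if (migrate_to)
        migrate(a, migrate_to);
    unlock_sock(a->sk);         /* may now be a different socket */
}

/* Fixed shape: lock and unlock go through one saved pointer. */
static void timeout_event_fixed(struct assoc *a, struct sock *migrate_to)
{
    struct sock *sk = a->sk;    /* saved before the critical section */

    lock_sock(sk);
    if (migrate_to)
        migrate(a, migrate_to);
    unlock_sock(sk);
}
```

In the buggy variant the listening socket is left locked and the new socket is unlocked without having been locked, which matches the "bad unlock balance" report quoted in the commit.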
    • sctp: Whitespace fix · f05940e6
      Karl Heiss authored
      Fix indentation in sctp_generate_heartbeat_event.
      Signed-off-by: Karl Heiss <kheiss@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  24. 29 Aug, 2015 1 commit
  25. 28 Aug, 2015 1 commit
    • sctp: donot reset the overall_error_count in SHUTDOWN_RECEIVE state · f648f807
      lucien authored
      Commit f8d96052 ("sctp: Enforce retransmission limit during shutdown")
      fixed a problem with excessive retransmissions in the SHUTDOWN_PENDING state
      by not resetting the association overall_error_count.  This allowed the
      association to better enforce the assoc.max_retrans limit.
      
      However, the same issue still exists when the association is in the
      SHUTDOWN_RECEIVED state.  In this state, HB-ACKs continue to reset the
      overall_error_count for the association, which extends the lifetime of the
      association unnecessarily.
      
      This patch solves this by resetting the overall_error_count only when the
      current state is smaller than SCTP_STATE_SHUTDOWN_PENDING.  As a small
      side-effect, we end up also handling the SCTP_STATE_SHUTDOWN_ACK_SENT and
      SCTP_STATE_SHUTDOWN_SENT states, but they are not really impacted because
      we disable Heartbeats in those states.
      
      Fixes: Commit f8d96052 ("sctp: Enforce retransmission limit during shutdown")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
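The state comparison described in the commit above relies on the association states being ordered. A minimal sketch, assuming an enum ordering in which every shutdown state sorts at or after SHUTDOWN_PENDING (the `ST_*` names and `on_heartbeat_ack()` are hypothetical, and the ordering is an assumption of this example):

```c
/* Assumed ordering: all shutdown states compare >= ST_SHUTDOWN_PENDING. */
enum state {
    ST_CLOSED,
    ST_COOKIE_WAIT,
    ST_COOKIE_ECHOED,
    ST_ESTABLISHED,
    ST_SHUTDOWN_PENDING,
    ST_SHUTDOWN_SENT,
    ST_SHUTDOWN_RECEIVED,
    ST_SHUTDOWN_ACK_SENT
};

struct assoc {
    enum state state;
    int overall_error_count;
};

/* A heartbeat ACK clears the error count only before shutdown begins. */
static void on_heartbeat_ack(struct assoc *a)
{
    if (a->state < ST_SHUTDOWN_PENDING)
        a->overall_error_count = 0;
}
```

A single ordered comparison covers SHUTDOWN_PENDING, SHUTDOWN_RECEIVED and the two remaining shutdown states at once, which is the side-effect the commit message mentions.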
  26. 28 Apr, 2014 1 commit
  27. 20 Feb, 2014 1 commit
    • net: sctp: Potentially-Failed state should not be reached from unconfirmed state · 7cce3b75
      Matija Glavinic Pecotic authored
      In the current implementation it is possible to reach the PF state from the
      unconfirmed state. We can interpret sctp-failover-02 in a way that the PF
      state is meant to be reached only from the active state; in the end, this is
      when entering the PF state makes sense. Here are a few quotes from
      sctp-failover-02, but regardless of these, the same understanding can be
      reached from the whole of section 5:
      
      Section 5.1, quickfailover guide:
          "The PF state is an intermediate state between Active and Failed states."
      
          "Each time the T3-rtx timer expires on an active or idle
          destination, the error counter of that destination address will
          be incremented.  When the value in the error counter exceeds
          PFMR, the endpoint should mark the destination transport address as PF."
      
      There are several concrete reasons for such an interpretation. For a start,
      rfc4960 does not take the quickfailover algorithm into account. Therefore,
      quickfailover must comply with 4960. The point where this compliance can be
      argued is the following behavior:
      When PF is entered, the association overall error counter is incremented for
      each missed HB. This contradicts rfc4960, as an address, while in the
      unconfirmed state, is subject to probing, and while it is probed, it should
      not increment the association overall error counter. As a consequence, we
      might end up dropping the association due to path failure on an unconfirmed
      address, in case we have a wrong configuration such as:
      Association.Max.Retrans == Path.Max.Retrans.
      
      Another reason is that entering PF from unconfirmed will cause the loss of the
      address confirmed event once (if) the address is confirmed. This is fine from
      the failover guide's point of view, but it is not consistent with the behavior
      preceding the failover implementation or with the recommendation from 4960:
      
      5.4.  Path Verification
         Whenever a path is confirmed, an indication MAY be given to the upper
         layer.
      Signed-off-by: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nsn.com>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
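The transition rule argued for in the commit above (PF is entered only from the active state once the error counter exceeds PFMR) can be written as a guard on the current state. This is a hypothetical model: `enum tstate`, `struct transport`, `pf_retrans` and `on_t3_rtx_expired()` are illustrative names, and the handling of the Failed state is omitted.

```c
/* Simplified transport states for this sketch. */
enum tstate { T_UNCONFIRMED, T_ACTIVE, T_PF, T_INACTIVE };

struct transport {
    enum tstate state;
    int error_count;
    int pf_retrans;     /* PFMR threshold from the quoted draft */
};

/* On a retransmission timeout, count the error and apply the PF rule. */
static void on_t3_rtx_expired(struct transport *t)
{
    t->error_count++;
    /* Only an active destination may become potentially failed. */
    if (t->state == T_ACTIVE && t->error_count > t->pf_retrans)
        t->state = T_PF;
}
```

An unconfirmed transport keeps its state in this model no matter how many timeouts it accumulates, so it never passes through PF on its way to being confirmed.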
  28. 22 Jan, 2014 1 commit