1. 27 Apr, 2019 40 commits
    • Masami Hiramatsu's avatar
      kprobes: Mark ftrace mcount handler functions nokprobe · 18a0a7c1
      Masami Hiramatsu authored
      commit fabe38ab6b2bd9418350284c63825f13b8a6abba upstream.
      
      Mark ftrace mcount handler functions nokprobe since
      probing on these functions with kretprobe pushes
      return address incorrectly on kretprobe shadow stack.
      Reported-by: default avatarFrancis Deslauriers <francis.deslauriers@efficios.com>
      Tested-by: default avatarAndrea Righi <righi.andrea@gmail.com>
      Signed-off-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Acked-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Acked-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/155094062044.6137.6419622920568680640.stgit@devboxSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      18a0a7c1
    • Masami Hiramatsu's avatar
      x86/kprobes: Verify stack frame on kretprobe · 877e9c51
      Masami Hiramatsu authored
      commit 3ff9c075cc767b3060bdac12da72fc94dd7da1b8 upstream.
      
      Verify the stack frame pointer on kretprobe trampoline handler,
      If the stack frame pointer does not match, it skips the wrong
      entry and tries to find correct one.
      
      This can happen if user puts the kretprobe on the function
      which can be used in the path of ftrace user-function call.
      Such functions should not be probed, so this adds a warning
      message that reports which function should be blacklisted.
      Tested-by: default avatarAndrea Righi <righi.andrea@gmail.com>
      Signed-off-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Acked-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/155094059185.6137.15527904013362842072.stgit@devboxSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      877e9c51
    • Nathan Chancellor's avatar
      arm64: futex: Restore oldval initialization to work around buggy compilers · 1f2b61e4
      Nathan Chancellor authored
      commit ff8acf929014b7f87315588e0daf8597c8aa9d1c upstream.
      
      Commit 045afc24124d ("arm64: futex: Fix FUTEX_WAKE_OP atomic ops with
      non-zero result value") removed oldval's zero initialization in
      arch_futex_atomic_op_inuser because it is not necessary. Unfortunately,
      Android's arm64 GCC 4.9.4 [1] does not agree:
      
      ../kernel/futex.c: In function 'do_futex':
      ../kernel/futex.c:1658:17: warning: 'oldval' may be used uninitialized
      in this function [-Wmaybe-uninitialized]
         return oldval == cmparg;
                       ^
      In file included from ../kernel/futex.c:73:0:
      ../arch/arm64/include/asm/futex.h:53:6: note: 'oldval' was declared here
        int oldval, ret, tmp;
            ^
      
      GCC fails to follow that when ret is non-zero, futex_atomic_op_inuser
      returns right away, avoiding the uninitialized use that it claims.
      Restoring the zero initialization works around this issue.
      
      [1]: https://android.googlesource.com/platform/prebuilts/gcc/linux-x86/aarch64/aarch64-linux-android-4.9/
      
      Cc: stable@vger.kernel.org
      Fixes: 045afc24124d ("arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value")
      Reviewed-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarNathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f2b61e4
    • Eric Biggers's avatar
      crypto: x86/poly1305 - fix overflow during partial reduction · 4de25ac0
      Eric Biggers authored
      commit 678cce4019d746da6c680c48ba9e6d417803e127 upstream.
      
      The x86_64 implementation of Poly1305 produces the wrong result on some
      inputs because poly1305_4block_avx2() incorrectly assumes that when
      partially reducing the accumulator, the bits carried from limb 'd4' to
      limb 'h0' fit in a 32-bit integer.  This is true for poly1305-generic
      which processes only one block at a time.  However, it's not true for
      the AVX2 implementation, which processes 4 blocks at a time and
      therefore can produce intermediate limbs about 4x larger.
      
      Fix it by making the relevant calculations use 64-bit arithmetic rather
      than 32-bit.  Note that most of the carries already used 64-bit
      arithmetic, but the d4 -> h0 carry was different for some reason.
      
      To be safe I also made the same change to the corresponding SSE2 code,
      though that only operates on 1 or 2 blocks at a time.  I don't think
      it's really needed for poly1305_block_sse2(), but it doesn't hurt
      because it's already x86_64 code.  It *might* be needed for
      poly1305_2block_sse2(), but overflows aren't easy to reproduce there.
      
      This bug was originally detected by my patches that improve testmgr to
      fuzz algorithms against their generic implementation.  But also add a
      test vector which reproduces it directly (in the AVX2 case).
      
      Fixes: b1ccc8f4 ("crypto: poly1305 - Add a four block AVX2 variant for x86_64")
      Fixes: c70f4abe ("crypto: poly1305 - Add a SSE2 SIMD variant for x86_64")
      Cc: <stable@vger.kernel.org> # v4.3+
      Cc: Martin Willi <martin@strongswan.org>
      Cc: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Reviewed-by: default avatarMartin Willi <martin@strongswan.org>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4de25ac0
    • Andrea Arcangeli's avatar
      coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping · bb461ad8
      Andrea Arcangeli authored
      commit 04f5866e41fb70690e28397487d8bd8eea7d712a upstream.
      
      The core dumping code has always run without holding the mmap_sem for
      writing, despite that is the only way to ensure that the entire vma
      layout will not change from under it.  Only using some signal
      serialization on the processes belonging to the mm is not nearly enough.
      This was pointed out earlier.  For example in Hugh's post from Jul 2017:
      
        https://lkml.kernel.org/r/alpine.LSU.2.11.1707191716030.2055@eggly.anvils
      
        "Not strictly relevant here, but a related note: I was very surprised
         to discover, only quite recently, how handle_mm_fault() may be called
         without down_read(mmap_sem) - when core dumping. That seems a
         misguided optimization to me, which would also be nice to correct"
      
      In particular because the growsdown and growsup can move the
      vm_start/vm_end the various loops the core dump does around the vma will
      not be consistent if page faults can happen concurrently.
      
      Pretty much all users calling mmget_not_zero()/get_task_mm() and then
      taking the mmap_sem had the potential to introduce unexpected side
      effects in the core dumping code.
      
      Adding mmap_sem for writing around the ->core_dump invocation is a
      viable long term fix, but it requires removing all copy user and page
      faults and to replace them with get_dump_page() for all binary formats
      which is not suitable as a short term fix.
      
      For the time being this solution manually covers the places that can
      confuse the core dump either by altering the vma layout or the vma flags
      while it runs.  Once ->core_dump runs under mmap_sem for writing the
      function mmget_still_valid() can be dropped.
      
      Allowing mmap_sem protected sections to run in parallel with the
      coredump provides some minor parallelism advantage to the swapoff code
      (which seems to be safe enough by never mangling any vma field and can
      keep doing swapins in parallel to the core dumping) and to some other
      corner case.
      
      In order to facilitate the backporting I added "Fixes: 86039bd3"
      however the side effect of this same race condition in /proc/pid/mem
      should be reproducible since before 2.6.12-rc2 so I couldn't add any
      other "Fixes:" because there's no hash beyond the git genesis commit.
      
      Because find_extend_vma() is the only location outside of the process
      context that could modify the "mm" structures under mmap_sem for
      reading, by adding the mmget_still_valid() check to it, all other cases
      that take the mmap_sem for reading don't need the new check after
      mmget_not_zero()/get_task_mm().  The expand_stack() in page fault
      context also doesn't need the new check, because all tasks under core
      dumping are frozen.
      
      Link: http://lkml.kernel.org/r/20190325224949.11068-1-aarcange@redhat.com
      Fixes: 86039bd3 ("userfaultfd: add new syscall to provide memory externalization")
      Signed-off-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Reported-by: default avatarJann Horn <jannh@google.com>
      Suggested-by: default avatarOleg Nesterov <oleg@redhat.com>
      Acked-by: default avatarPeter Xu <peterx@redhat.com>
      Reviewed-by: default avatarMike Rapoport <rppt@linux.ibm.com>
      Reviewed-by: default avatarOleg Nesterov <oleg@redhat.com>
      Reviewed-by: default avatarJann Horn <jannh@google.com>
      Acked-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bb461ad8
    • Suthikulpanit, Suravee's avatar
      Revert "svm: Fix AVIC incomplete IPI emulation" · c197e469
      Suthikulpanit, Suravee authored
      commit 4a58038b9e420276157785afa0a0bbb4b9bc2265 upstream.
      
      This reverts commit bb218fbcfaaa3b115d4cd7a43c0ca164f3a96e57.
      
      As Oren Twaig pointed out the old discussion:
      
        https://patchwork.kernel.org/patch/8292231/
      
      that the change coud potentially cause an extra IPI to be sent to
      the destination vcpu because the AVIC hardware already set the IRR bit
      before the incomplete IPI #VMEXIT with id=1 (target vcpu is not running).
      Since writting to ICR and ICR2 will also set the IRR. If something triggers
      the destination vcpu to get scheduled before the emulation finishes, then
      this could result in an additional IPI.
      
      Also, the issue mentioned in the commit bb218fbcfaaa was misdiagnosed.
      
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Reported-by: default avatarOren Twaig <oren@scalemp.com>
      Signed-off-by: default avatarSuravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c197e469
    • Saurav Kashyap's avatar
      Revert "scsi: fcoe: clear FC_RP_STARTED flags when receiving a LOGO" · bc6b83db
      Saurav Kashyap authored
      commit 0228034d8e5915b98c33db35a98f5e909e848ae9 upstream.
      
      This patch clears FC_RP_STARTED flag during logoff, because of this
      re-login(flogi) didn't happen to the switch.
      
      This reverts commit 1550ec45.
      
      Fixes: 1550ec45 ("scsi: fcoe: clear FC_RP_STARTED flags when receiving a LOGO")
      Cc: <stable@vger.kernel.org> # v4.18+
      Signed-off-by: default avatarSaurav Kashyap <skashyap@marvell.com>
      Reviewed-by: default avatarHannes Reinecke <hare@#suse.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bc6b83db
    • Jaesoo Lee's avatar
      scsi: core: set result when the command cannot be dispatched · 49c67980
      Jaesoo Lee authored
      commit be549d49115422f846b6d96ee8fd7173a5f7ceb0 upstream.
      
      When SCSI blk-mq is enabled, there is a bug in handling errors in
      scsi_queue_rq.  Specifically, the bug is not setting result field of
      scsi_request correctly when the dispatch of the command has been
      failed. Since the upper layer code including the sg_io ioctl expects to
      receive any error status from result field of scsi_request, the error is
      silently ignored and this could cause data corruptions for some
      applications.
      
      Fixes: d285203c ("scsi: add support for a blk-mq based I/O path.")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarJaesoo Lee <jalee@purestorage.com>
      Reviewed-by: default avatarHannes Reinecke <hare@suse.com>
      Reviewed-by: default avatarBart Van Assche <bvanassche@acm.org>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      49c67980
    • Takashi Iwai's avatar
      ALSA: core: Fix card races between register and disconnect · d11a33e9
      Takashi Iwai authored
      commit 2a3f7221acddfe1caa9ff09b3a8158c39b2fdeac upstream.
      
      There is a small race window in the card disconnection code that
      allows the registration of another card with the very same card id.
      This leads to a warning in procfs creation as caught by syzkaller.
      
      The problem is that we delete snd_cards and snd_cards_lock entries at
      the very beginning of the disconnection procedure.  This makes the
      slot available to be assigned for another card object while the
      disconnection procedure is being processed.  Then it becomes possible
      to issue a procfs registration with the existing file name although we
      check the conflict beforehand.
      
      The fix is simply to move the snd_cards and snd_cards_lock clearances
      at the end of the disconnection procedure.  The references to these
      entries are merely either from the global proc files like
      /proc/asound/cards or from the card registration / disconnection, so
      it should be fine to shift at the very end.
      
      Reported-by: syzbot+48df349490c36f9f54ab@syzkaller.appspotmail.com
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d11a33e9
    • Hui Wang's avatar
      ALSA: hda/realtek - add two more pin configuration sets to quirk table · cfb5fb04
      Hui Wang authored
      commit b26e36b7ef36a8a3a147b1609b2505f8a4ecf511 upstream.
      
      We have two Dell laptops which have the codec 10ec0236 and 10ec0256
      respectively, the headset mic on them can't work, need to apply the
      quirk of ALC255_FIXUP_DELL1_MIC_NO_PRESENCE. So adding their pin
      configurations in the pin quirk table.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarHui Wang <hui.wang@canonical.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cfb5fb04
    • Ian Abbott's avatar
      staging: comedi: ni_usb6501: Fix possible double-free of ->usb_rx_buf · 9ae4c50f
      Ian Abbott authored
      commit af4b54a2e5ba18259ff9aac445bf546dd60d037e upstream.
      
      `ni6501_alloc_usb_buffers()` is called from `ni6501_auto_attach()` to
      allocate RX and TX buffers for USB transfers.  It allocates
      `devpriv->usb_rx_buf` followed by `devpriv->usb_tx_buf`.  If the
      allocation of `devpriv->usb_tx_buf` fails, it frees
      `devpriv->usb_rx_buf`, leaving the pointer set dangling, and returns an
      error.  Later, `ni6501_detach()` will be called from the core comedi
      module code to clean up.  `ni6501_detach()` also frees both
      `devpriv->usb_rx_buf` and `devpriv->usb_tx_buf`, but
      `devpriv->usb_rx_buf` may have already beed freed, leading to a
      double-free error.  Fix it bu removing the call to
      `kfree(devpriv->usb_rx_buf)` from `ni6501_alloc_usb_buffers()`, relying
      on `ni6501_detach()` to free the memory.
      Signed-off-by: default avatarIan Abbott <abbotti@mev.co.uk>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9ae4c50f
    • Ian Abbott's avatar
      staging: comedi: ni_usb6501: Fix use of uninitialized mutex · 72cd1a32
      Ian Abbott authored
      commit 660cf4ce9d0f3497cc7456eaa6d74c8b71d6282c upstream.
      
      If `ni6501_auto_attach()` returns an error, the core comedi module code
      will call `ni6501_detach()` to clean up.  If `ni6501_auto_attach()`
      successfully allocated the comedi device private data, `ni6501_detach()`
      assumes that a `struct mutex mut` contained in the private data has been
      initialized and uses it.  Unfortunately, there are a couple of places
      where `ni6501_auto_attach()` can return an error after allocating the
      device private data but before initializing the mutex, so this
      assumption is invalid.  Fix it by initializing the mutex just after
      allocating the private data in `ni6501_auto_attach()` before any other
      errors can be retturned.  Also move the call to `usb_set_intfdata()`
      just to keep the code a bit neater (either position for the call is
      fine).
      
      I believe this was the cause of the following syzbot crash report
      <https://syzkaller.appspot.com/bug?extid=cf4f2b6c24aff0a3edf6>:
      
      usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
      usb 1-1: config 0 descriptor??
      usb 1-1: string descriptor 0 read error: -71
      comedi comedi0: Wrong number of endpoints
      ni6501 1-1:0.233: driver 'ni6501' failed to auto-configure device.
      INFO: trying to register non-static key.
      the code is fine but needs lockdep annotation.
      turning off the locking correctness validator.
      CPU: 0 PID: 585 Comm: kworker/0:3 Not tainted 5.1.0-rc4-319354-g9a33b36 #3
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: usb_hub_wq hub_event
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0xe8/0x16e lib/dump_stack.c:113
       assign_lock_key kernel/locking/lockdep.c:786 [inline]
       register_lock_class+0x11b8/0x1250 kernel/locking/lockdep.c:1095
       __lock_acquire+0xfb/0x37c0 kernel/locking/lockdep.c:3582
       lock_acquire+0x10d/0x2f0 kernel/locking/lockdep.c:4211
       __mutex_lock_common kernel/locking/mutex.c:925 [inline]
       __mutex_lock+0xfe/0x12b0 kernel/locking/mutex.c:1072
       ni6501_detach+0x5b/0x110 drivers/staging/comedi/drivers/ni_usb6501.c:567
       comedi_device_detach+0xed/0x800 drivers/staging/comedi/drivers.c:204
       comedi_device_cleanup.part.0+0x68/0x140 drivers/staging/comedi/comedi_fops.c:156
       comedi_device_cleanup drivers/staging/comedi/comedi_fops.c:187 [inline]
       comedi_free_board_dev.part.0+0x16/0x90 drivers/staging/comedi/comedi_fops.c:190
       comedi_free_board_dev drivers/staging/comedi/comedi_fops.c:189 [inline]
       comedi_release_hardware_device+0x111/0x140 drivers/staging/comedi/comedi_fops.c:2880
       comedi_auto_config.cold+0x124/0x1b0 drivers/staging/comedi/drivers.c:1068
       usb_probe_interface+0x31d/0x820 drivers/usb/core/driver.c:361
       really_probe+0x2da/0xb10 drivers/base/dd.c:509
       driver_probe_device+0x21d/0x350 drivers/base/dd.c:671
       __device_attach_driver+0x1d8/0x290 drivers/base/dd.c:778
       bus_for_each_drv+0x163/0x1e0 drivers/base/bus.c:454
       __device_attach+0x223/0x3a0 drivers/base/dd.c:844
       bus_probe_device+0x1f1/0x2a0 drivers/base/bus.c:514
       device_add+0xad2/0x16e0 drivers/base/core.c:2106
       usb_set_configuration+0xdf7/0x1740 drivers/usb/core/message.c:2021
       generic_probe+0xa2/0xda drivers/usb/core/generic.c:210
       usb_probe_device+0xc0/0x150 drivers/usb/core/driver.c:266
       really_probe+0x2da/0xb10 drivers/base/dd.c:509
       driver_probe_device+0x21d/0x350 drivers/base/dd.c:671
       __device_attach_driver+0x1d8/0x290 drivers/base/dd.c:778
       bus_for_each_drv+0x163/0x1e0 drivers/base/bus.c:454
       __device_attach+0x223/0x3a0 drivers/base/dd.c:844
       bus_probe_device+0x1f1/0x2a0 drivers/base/bus.c:514
       device_add+0xad2/0x16e0 drivers/base/core.c:2106
       usb_new_device.cold+0x537/0xccf drivers/usb/core/hub.c:2534
       hub_port_connect drivers/usb/core/hub.c:5089 [inline]
       hub_port_connect_change drivers/usb/core/hub.c:5204 [inline]
       port_event drivers/usb/core/hub.c:5350 [inline]
       hub_event+0x138e/0x3b00 drivers/usb/core/hub.c:5432
       process_one_work+0x90f/0x1580 kernel/workqueue.c:2269
       worker_thread+0x9b/0xe20 kernel/workqueue.c:2415
       kthread+0x313/0x420 kernel/kthread.c:253
       ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
      
      Reported-by: syzbot+cf4f2b6c24aff0a3edf6@syzkaller.appspotmail.com
      Signed-off-by: default avatarIan Abbott <abbotti@mev.co.uk>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      72cd1a32
    • Ian Abbott's avatar
      staging: comedi: vmk80xx: Fix possible double-free of ->usb_rx_buf · 9de6de26
      Ian Abbott authored
      commit 663d294b4768bfd89e529e069bffa544a830b5bf upstream.
      
      `vmk80xx_alloc_usb_buffers()` is called from `vmk80xx_auto_attach()` to
      allocate RX and TX buffers for USB transfers.  It allocates
      `devpriv->usb_rx_buf` followed by `devpriv->usb_tx_buf`.  If the
      allocation of `devpriv->usb_tx_buf` fails, it frees
      `devpriv->usb_rx_buf`,  leaving the pointer set dangling, and returns an
      error.  Later, `vmk80xx_detach()` will be called from the core comedi
      module code to clean up.  `vmk80xx_detach()` also frees both
      `devpriv->usb_rx_buf` and `devpriv->usb_tx_buf`, but
      `devpriv->usb_rx_buf` may have already been freed, leading to a
      double-free error.  Fix it by removing the call to
      `kfree(devpriv->usb_rx_buf)` from `vmk80xx_alloc_usb_buffers()`, relying
      on `vmk80xx_detach()` to free the memory.
      Signed-off-by: default avatarIan Abbott <abbotti@mev.co.uk>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9de6de26
    • Ian Abbott's avatar
      staging: comedi: vmk80xx: Fix use of uninitialized semaphore · 586f669a
      Ian Abbott authored
      commit 08b7c2f9208f0e2a32159e4e7a4831b7adb10a3e upstream.
      
      If `vmk80xx_auto_attach()` returns an error, the core comedi module code
      will call `vmk80xx_detach()` to clean up.  If `vmk80xx_auto_attach()`
      successfully allocated the comedi device private data,
      `vmk80xx_detach()` assumes that a `struct semaphore limit_sem` contained
      in the private data has been initialized and uses it.  Unfortunately,
      there are a couple of places where `vmk80xx_auto_attach()` can return an
      error after allocating the device private data but before initializing
      the semaphore, so this assumption is invalid.  Fix it by initializing
      the semaphore just after allocating the private data in
      `vmk80xx_auto_attach()` before any other errors can be returned.
      
      I believe this was the cause of the following syzbot crash report
      <https://syzkaller.appspot.com/bug?extid=54c2f58f15fe6876b6ad>:
      
      usb 1-1: config 0 has no interface number 0
      usb 1-1: New USB device found, idVendor=10cf, idProduct=8068, bcdDevice=e6.8d
      usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
      usb 1-1: config 0 descriptor??
      vmk80xx 1-1:0.117: driver 'vmk80xx' failed to auto-configure device.
      INFO: trying to register non-static key.
      the code is fine but needs lockdep annotation.
      turning off the locking correctness validator.
      CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.1.0-rc4-319354-g9a33b36 #3
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: usb_hub_wq hub_event
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0xe8/0x16e lib/dump_stack.c:113
       assign_lock_key kernel/locking/lockdep.c:786 [inline]
       register_lock_class+0x11b8/0x1250 kernel/locking/lockdep.c:1095
       __lock_acquire+0xfb/0x37c0 kernel/locking/lockdep.c:3582
       lock_acquire+0x10d/0x2f0 kernel/locking/lockdep.c:4211
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x44/0x60 kernel/locking/spinlock.c:152
       down+0x12/0x80 kernel/locking/semaphore.c:58
       vmk80xx_detach+0x59/0x100 drivers/staging/comedi/drivers/vmk80xx.c:829
       comedi_device_detach+0xed/0x800 drivers/staging/comedi/drivers.c:204
       comedi_device_cleanup.part.0+0x68/0x140 drivers/staging/comedi/comedi_fops.c:156
       comedi_device_cleanup drivers/staging/comedi/comedi_fops.c:187 [inline]
       comedi_free_board_dev.part.0+0x16/0x90 drivers/staging/comedi/comedi_fops.c:190
       comedi_free_board_dev drivers/staging/comedi/comedi_fops.c:189 [inline]
       comedi_release_hardware_device+0x111/0x140 drivers/staging/comedi/comedi_fops.c:2880
       comedi_auto_config.cold+0x124/0x1b0 drivers/staging/comedi/drivers.c:1068
       usb_probe_interface+0x31d/0x820 drivers/usb/core/driver.c:361
       really_probe+0x2da/0xb10 drivers/base/dd.c:509
       driver_probe_device+0x21d/0x350 drivers/base/dd.c:671
       __device_attach_driver+0x1d8/0x290 drivers/base/dd.c:778
       bus_for_each_drv+0x163/0x1e0 drivers/base/bus.c:454
       __device_attach+0x223/0x3a0 drivers/base/dd.c:844
       bus_probe_device+0x1f1/0x2a0 drivers/base/bus.c:514
       device_add+0xad2/0x16e0 drivers/base/core.c:2106
       usb_set_configuration+0xdf7/0x1740 drivers/usb/core/message.c:2021
       generic_probe+0xa2/0xda drivers/usb/core/generic.c:210
       usb_probe_device+0xc0/0x150 drivers/usb/core/driver.c:266
       really_probe+0x2da/0xb10 drivers/base/dd.c:509
       driver_probe_device+0x21d/0x350 drivers/base/dd.c:671
       __device_attach_driver+0x1d8/0x290 drivers/base/dd.c:778
       bus_for_each_drv+0x163/0x1e0 drivers/base/bus.c:454
       __device_attach+0x223/0x3a0 drivers/base/dd.c:844
       bus_probe_device+0x1f1/0x2a0 drivers/base/bus.c:514
       device_add+0xad2/0x16e0 drivers/base/core.c:2106
       usb_new_device.cold+0x537/0xccf drivers/usb/core/hub.c:2534
       hub_port_connect drivers/usb/core/hub.c:5089 [inline]
       hub_port_connect_change drivers/usb/core/hub.c:5204 [inline]
       port_event drivers/usb/core/hub.c:5350 [inline]
       hub_event+0x138e/0x3b00 drivers/usb/core/hub.c:5432
       process_one_work+0x90f/0x1580 kernel/workqueue.c:2269
       worker_thread+0x9b/0xe20 kernel/workqueue.c:2415
       kthread+0x313/0x420 kernel/kthread.c:253
       ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
      
      Reported-by: syzbot+54c2f58f15fe6876b6ad@syzkaller.appspotmail.com
      Signed-off-by: default avatarIan Abbott <abbotti@mev.co.uk>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      586f669a
    • he, bo's avatar
      io: accel: kxcjk1013: restore the range after resume. · de970d4e
      he, bo authored
      commit fe2d3df639a7940a125a33d6460529b9689c5406 upstream.
      
      On some laptops, kxcjk1013 is powered off when system enters S3. We need
      restore the range regiter during resume. Otherwise, the sensor doesn't
      work properly after S3.
      Signed-off-by: default avatarhe, bo <bo.he@intel.com>
      Signed-off-by: default avatarChen, Hu <hu1.chen@intel.com>
      Reviewed-by: default avatarHans de Goede <hdegoede@redhat.com>
      Cc: <Stable@vger.kernel.org>
      Signed-off-by: default avatarJonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      de970d4e
    • Fabrice Gasnier's avatar
      iio: core: fix a possible circular locking dependency · 0386fd65
      Fabrice Gasnier authored
      commit 7f75591fc5a123929a29636834d1bcb8b5c9fee3 upstream.
      
      This fixes a possible circular locking dependency detected warning seen
      with:
      - CONFIG_PROVE_LOCKING=y
      - consumer/provider IIO devices (ex: "voltage-divider" consumer of "adc")
      
      When using the IIO consumer interface, e.g. iio_channel_get(), the consumer
      device will likely call iio_read_channel_raw() or similar that rely on
      'info_exist_lock' mutex.
      
      typically:
      ...
      	mutex_lock(&chan->indio_dev->info_exist_lock);
      	if (chan->indio_dev->info == NULL) {
      		ret = -ENODEV;
      		goto err_unlock;
      	}
      	ret = do_some_ops()
      err_unlock:
      	mutex_unlock(&chan->indio_dev->info_exist_lock);
      	return ret;
      ...
      
      Same mutex is also hold in iio_device_unregister().
      
      The following deadlock warning happens when:
      - the consumer device has called an API like iio_read_channel_raw()
        at least once.
      - the consumer driver is unregistered, removed (unbind from sysfs)
      
      ======================================================
      WARNING: possible circular locking dependency detected
      4.19.24 #577 Not tainted
      ------------------------------------------------------
      sh/372 is trying to acquire lock:
      (kn->count#30){++++}, at: kernfs_remove_by_name_ns+0x3c/0x84
      
      but task is already holding lock:
      (&dev->info_exist_lock){+.+.}, at: iio_device_unregister+0x18/0x60
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #1 (&dev->info_exist_lock){+.+.}:
             __mutex_lock+0x70/0xa3c
             mutex_lock_nested+0x1c/0x24
             iio_read_channel_raw+0x1c/0x60
             iio_read_channel_info+0xa8/0xb0
             dev_attr_show+0x1c/0x48
             sysfs_kf_seq_show+0x84/0xec
             seq_read+0x154/0x528
             __vfs_read+0x2c/0x15c
             vfs_read+0x8c/0x110
             ksys_read+0x4c/0xac
             ret_fast_syscall+0x0/0x28
             0xbedefb60
      
      -> #0 (kn->count#30){++++}:
             lock_acquire+0xd8/0x268
             __kernfs_remove+0x288/0x374
             kernfs_remove_by_name_ns+0x3c/0x84
             remove_files+0x34/0x78
             sysfs_remove_group+0x40/0x9c
             sysfs_remove_groups+0x24/0x34
             device_remove_attrs+0x38/0x64
             device_del+0x11c/0x360
             cdev_device_del+0x14/0x2c
             iio_device_unregister+0x24/0x60
             release_nodes+0x1bc/0x200
             device_release_driver_internal+0x1a0/0x230
             unbind_store+0x80/0x130
             kernfs_fop_write+0x100/0x1e4
             __vfs_write+0x2c/0x160
             vfs_write+0xa4/0x17c
             ksys_write+0x4c/0xac
             ret_fast_syscall+0x0/0x28
             0xbe906840
      
      other info that might help us debug this:
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(&dev->info_exist_lock);
                                     lock(kn->count#30);
                                     lock(&dev->info_exist_lock);
        lock(kn->count#30);
      
       *** DEADLOCK ***
      ...
      
      cdev_device_del() can be called without holding the lock. It should be safe
      as info_exist_lock prevents kernelspace consumers to use the exported
      routines during/after provider removal. cdev_device_del() is for userspace.
      
      Help to reproduce:
      See example: Documentation/devicetree/bindings/iio/afe/voltage-divider.txt
      sysv {
      	compatible = "voltage-divider";
      	io-channels = <&adc 0>;
      	output-ohms = <22>;
      	full-ohms = <222>;
      };
      
      First, go to iio:deviceX for the "voltage-divider", do one read:
      $ cd /sys/bus/iio/devices/iio:deviceX
      $ cat in_voltage0_raw
      
      Then, unbind the consumer driver. It triggers above deadlock warning.
      $ cd /sys/bus/platform/drivers/iio-rescale/
      $ echo sysv > unbind
      
      Note I don't actually expect stable will pick this up all the
      way back into IIO being in staging, but if's probably valid that
      far back.
      Signed-off-by: default avatarFabrice Gasnier <fabrice.gasnier@st.com>
      Fixes: ac917a81 ("staging:iio:core set the iio_dev.info pointer to null on unregister")
      Cc: <Stable@vger.kernel.org>
      Signed-off-by: default avatarJonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0386fd65
    • Georg Ottinger's avatar
      iio: adc: at91: disable adc channel interrupt in timeout case · fe400daf
      Georg Ottinger authored
      commit 09c6bdee51183a575bf7546890c8c137a75a2b44 upstream.
      
      Having a brief look at at91_adc_read_raw() it is obvious that in the case
      of a timeout the setting of AT91_ADC_CHDR and AT91_ADC_IDR registers is
      omitted. If 2 different channels are queried we can end up with a
      situation where two interrupts are enabled, but only one interrupt is
      cleared in the interrupt handler. Resulting in a interrupt loop and a
      system hang.
      Signed-off-by: default avatarGeorg Ottinger <g.ottinger@abatec.at>
      Acked-by: default avatarLudovic Desroches <ludovic.desroches@microchip.com>
      Cc: <Stable@vger.kernel.org>
      Signed-off-by: default avatarJonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fe400daf
    • Lars-Peter Clausen's avatar
      iio: Fix scan mask selection · 1ae7297b
      Lars-Peter Clausen authored
      commit 20ea39ef9f2f911bd01c69519e7d69cfec79fde3 upstream.
      
      The trialmask is expected to have all bits set to 0 after allocation.
      Currently kmalloc_array() is used which does not zero the memory and so
      random bits are set. This results in random channels being enabled when
      they shouldn't. Replace kmalloc_array() with kcalloc() which has the same
      interface but zeros the memory.
      
      Note the fix is actually required earlier than the below fixes tag, but
      will require a manual backport due to move from kmalloc to kmalloc_array.
      Signed-off-by: default avatarLars-Peter Clausen <lars@metafoo.de>
      Signed-off-by: default avatarAlexandru Ardelean <alexandru.ardelean@analog.com>
      Fixes commit 057ac1ac ("iio: Use kmalloc_array() in iio_scan_mask_set()").
      Cc: <Stable@vger.kernel.org>
      Signed-off-by: default avatarJonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1ae7297b
    • Jean-Francois Dagenais's avatar
      iio: dac: mcp4725: add missing powerdown bits in store eeprom · 8ba5d597
      Jean-Francois Dagenais authored
      commit 06003531502d06bc89d32528f6ec96bf978790f9 upstream.
      
      When issuing the write DAC register and write eeprom command, the two
      powerdown bits (PD0 and PD1) are assumed by the chip to be present in
      the bytes sent. Leaving them at 0 implies "powerdown disabled" which is
      a different state that the current one. By adding the current state of
      the powerdown in the i2c write, the chip will correctly power-on exactly
      like as it is at the moment of store_eeprom call.
      
      This is documented in MCP4725's datasheet, FIGURE 6-2: "Write Commands
      for DAC Input Register and EEPROM" and MCP4726's datasheet, FIGURE 6-3:
      "Write All Memory Command".
      Signed-off-by: default avatarJean-Francois Dagenais <jeff.dagenais@gmail.com>
      Acked-by: default avatarPeter Meerwald-Stadler <pmeerw@pmeerw.net>
      Cc: <Stable@vger.kernel.org>
      Signed-off-by: default avatarJonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8ba5d597
    • Dragos Bogdan's avatar
      iio: ad_sigma_delta: select channel when reading register · ad0f65cd
      Dragos Bogdan authored
      commit fccfb9ce70ed4ea7a145f77b86de62e38178517f upstream.
      
      The desired channel has to be selected in order to correctly fill the
      buffer with the corresponding data.
      The `ad_sd_write_reg()` already does this, but for the
      `ad_sd_read_reg_raw()` this was omitted.
      
      Fixes: af300848 ("iio:adc: Add common code for ADI Sigma Delta devices")
      Signed-off-by: default avatarDragos Bogdan <dragos.bogdan@analog.com>
      Signed-off-by: default avatarAlexandru Ardelean <alexandru.ardelean@analog.com>
      Cc: <Stable@vger.kernel.org>
      Signed-off-by: default avatarJonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad0f65cd
    • Gwendal Grignou's avatar
      iio: cros_ec: Fix the maths for gyro scale calculation · 03363057
      Gwendal Grignou authored
      commit 3d02d7082e5823598090530c3988a35f69689943 upstream.
      
      Calculation did not use IIO_DEGREE_TO_RAD and implemented a variant to
      avoid precision loss as we aim a nano value. The offset added to avoid
      rounding error, though, doesn't give us a close result to the expected
      value. E.g.
      
      For 1000dps, the result should be:
      
          (1000 * pi ) / 180 >> 15 ~= 0.000532632218
      
      But with current calculation we get
      
          $ cat scale
          0.000547890
      
      Fix the calculation by just doing the maths involved for a nano value
      
         val * pi * 10e12 / (180 * 2^15)
      
      so we get a closer result.
      
          $ cat scale
          0.000532632
      
      Fixes: c14dca07 ("iio: cros_ec_sensors: add ChromeOS EC Contiguous Sensors driver")
      Signed-off-by: default avatarGwendal Grignou <gwendal@chromium.org>
      Signed-off-by: default avatarEnric Balletbo i Serra <enric.balletbo@collabora.com>
      Cc: <Stable@vger.kernel.org>
      Signed-off-by: default avatarJonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      03363057
    • Mike Looijmans's avatar
      iio/gyro/bmg160: Use millidegrees for temperature scale · f245ce44
      Mike Looijmans authored
      commit 40a7198a4a01037003c7ca714f0d048a61e729ac upstream.
      
      Standard unit for temperature is millidegrees Celcius, whereas this driver
      was reporting in degrees. Fix the scale factor in the driver.
      Signed-off-by: default avatarMike Looijmans <mike.looijmans@topic.nl>
      Cc: <Stable@vger.kernel.org>
      Signed-off-by: default avatarJonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f245ce44
    • Sergey Larin's avatar
      iio: gyro: mpu3050: fix chip ID reading · b7ad00c0
      Sergey Larin authored
      commit 409a51e0a4a5f908763191fae2c29008632eb712 upstream.
      
      According to the datasheet, the last bit of CHIP_ID register controls
      I2C bus, and the first one is unused. Handle this correctly.
      
      Note that there are chips out there that have a value such that
      the id check currently fails.
      Signed-off-by: default avatarSergey Larin <cerg2010cerg2010@mail.ru>
      Reviewed-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Cc: <Stable@vger.kernel.org>
      Signed-off-by: default avatarJonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b7ad00c0
    • Mircea Caprioru's avatar
      staging: iio: ad7192: Fix ad7193 channel address · 63aafa65
      Mircea Caprioru authored
      commit 7ce0f216221856a17fc4934b39284678a5fef2e9 upstream.
      
      This patch fixes the differential channels addresses for the ad7193.
      Signed-off-by: default avatarMircea Caprioru <mircea.caprioru@analog.com>
      Cc: <Stable@vger.kernel.org>
      Signed-off-by: default avatarJonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      63aafa65
    • Leonard Pollak's avatar
      Staging: iio: meter: fixed typo · b69e4d5f
      Leonard Pollak authored
      commit 0a8a29be499cbb67df79370aaf5109085509feb8 upstream.
      
      This patch fixes an obvious typo, which will cause erroneously returning the Peak
      Voltage instead of the Peak Current.
      Signed-off-by: default avatarLeonard Pollak <leonardp@tr-host.de>
      Cc: <Stable@vger.kernel.org>
      Acked-by: default avatarMichael Hennerich <michael.hennerich@analog.com>
      Signed-off-by: default avatarJonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b69e4d5f
    • Vitaly Kuznetsov's avatar
      KVM: x86: svm: make sure NMI is injected after nmi_singlestep · 8091c2ae
      Vitaly Kuznetsov authored
      commit 99c221796a810055974b54c02e8f53297e48d146 upstream.
      
      I noticed that apic test from kvm-unit-tests always hangs on my EPYC 7401P,
      the hanging test nmi-after-sti is trying to deliver 30000 NMIs and tracing
      shows that we're sometimes able to deliver a few but never all.
      
      When we're trying to inject an NMI we may fail to do so immediately for
      various reasons, however, we still need to inject it so enable_nmi_window()
      arms nmi_singlestep mode. #DB occurs as expected, but we're not checking
      for pending NMIs before entering the guest and unless there's a different
      event to process, the NMI will never get delivered.
      
      Make KVM_REQ_EVENT request on the vCPU from db_interception() to make sure
      pending NMIs are checked and possibly injected.
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8091c2ae
    • Sean Christopherson's avatar
      KVM: x86: Don't clear EFER during SMM transitions for 32-bit vCPU · bc191940
      Sean Christopherson authored
      commit 8f4dc2e77cdfaf7e644ef29693fa229db29ee1de upstream.
      
      Neither AMD nor Intel CPUs have an EFER field in the legacy SMRAM save
      state area, i.e. don't save/restore EFER across SMM transitions.  KVM
      somewhat models this, e.g. doesn't clear EFER on entry to SMM if the
      guest doesn't support long mode.  But during RSM, KVM unconditionally
      clears EFER so that it can get back to pure 32-bit mode in order to
      start loading CRs with their actual non-SMM values.
      
      Clear EFER only when it will be written when loading the non-SMM state
      so as to preserve bits that can theoretically be set on 32-bit vCPUs,
      e.g. KVM always emulates EFER_SCE.
      
      And because CR4.PAE is cleared only to play nice with EFER, wrap that
      code in the long mode check as well.  Note, this may result in a
      compiler warning about cr4 being consumed uninitialized.  Re-read CR4
      even though it's technically unnecessary, as doing so allows for more
      readable code and RSM emulation is not a performance critical path.
      
      Fixes: 660a5d51 ("KVM: x86: save/load state on SMM switch")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bc191940
    • Aurelien Aptel's avatar
      CIFS: keep FileInfo handle live during oplock break · 53fc31a4
      Aurelien Aptel authored
      commit b98749cac4a695f084a5ff076f4510b23e353ecd upstream.
      
      In the oplock break handler, writing pending changes from pages puts
      the FileInfo handle. If the refcount reaches zero it closes the handle
      and waits for any oplock break handler to return, thus causing a deadlock.
      
      To prevent this situation:
      
      * We add a wait flag to cifsFileInfo_put() to decide whether we should
        wait for running/pending oplock break handlers
      
      * We keep an additionnal reference of the SMB FileInfo handle so that
        for the rest of the handler putting the handle won't close it.
        - The ref is bumped everytime we queue the handler via the
          cifs_queue_oplock_break() helper.
        - The ref is decremented at the end of the handler
      
      This bug was triggered by xfstest 464.
      
      Also important fix to address the various reports of
      oops in smb2_push_mandatory_locks
      Signed-off-by: default avatarAurelien Aptel <aaptel@suse.com>
      Signed-off-by: default avatarSteve French <stfrench@microsoft.com>
      Reviewed-by: default avatarPavel Shilovsky <pshilov@microsoft.com>
      CC: Stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      53fc31a4
    • Matteo Croce's avatar
      net: thunderx: don't allow jumbo frames with XDP · 4a692bc1
      Matteo Croce authored
      [ Upstream commit 1f227d16083b2e280b7dde4ca78883d75593f2fd ]
      
      The thunderx driver forbids to load an eBPF program if the MTU is too high,
      but this can be circumvented by loading the eBPF, then raising the MTU.
      
      Fix this by limiting the MTU if an eBPF program is already loaded.
      
      Fixes: 05c773f5 ("net: thunderx: Add basic XDP support")
      Signed-off-by: default avatarMatteo Croce <mcroce@redhat.com>
      Acked-by: default avatarJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4a692bc1
    • Matteo Croce's avatar
      net: thunderx: raise XDP MTU to 1508 · 6d5d30e0
      Matteo Croce authored
      [ Upstream commit 5ee15c101f29e0093ffb5448773ccbc786eb313b ]
      
      The thunderx driver splits frames bigger than 1530 bytes to multiple
      pages, making impossible to run an eBPF program on it.
      This leads to a maximum MTU of 1508 if QinQ is in use.
      
      The thunderx driver forbids to load an eBPF program if the MTU is higher
      than 1500 bytes. Raise the limit to 1508 so it is possible to use L2
      protocols which need some more headroom.
      
      Fixes: 05c773f5 ("net: thunderx: Add basic XDP support")
      Signed-off-by: default avatarMatteo Croce <mcroce@redhat.com>
      Acked-by: default avatarJesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6d5d30e0
    • Eric Dumazet's avatar
      ipv4: ensure rcu_read_lock() in ipv4_link_failure() · 0a7e8300
      Eric Dumazet authored
      [ Upstream commit c543cb4a5f07e09237ec0fc2c60c9f131b2c79ad ]
      
      fib_compute_spec_dst() needs to be called under rcu protection.
      
      syzbot reported :
      
      WARNING: suspicious RCU usage
      5.1.0-rc4+ #165 Not tainted
      include/linux/inetdevice.h:220 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      1 lock held by swapper/0/0:
       #0: 0000000051b67925 ((&n->timer)){+.-.}, at: lockdep_copy_map include/linux/lockdep.h:170 [inline]
       #0: 0000000051b67925 ((&n->timer)){+.-.}, at: call_timer_fn+0xda/0x720 kernel/time/timer.c:1315
      
      stack backtrace:
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.1.0-rc4+ #165
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5162
       __in_dev_get_rcu include/linux/inetdevice.h:220 [inline]
       fib_compute_spec_dst+0xbbd/0x1030 net/ipv4/fib_frontend.c:294
       spec_dst_fill net/ipv4/ip_options.c:245 [inline]
       __ip_options_compile+0x15a7/0x1a10 net/ipv4/ip_options.c:343
       ipv4_link_failure+0x172/0x400 net/ipv4/route.c:1195
       dst_link_failure include/net/dst.h:427 [inline]
       arp_error_report+0xd1/0x1c0 net/ipv4/arp.c:297
       neigh_invalidate+0x24b/0x570 net/core/neighbour.c:995
       neigh_timer_handler+0xc35/0xf30 net/core/neighbour.c:1081
       call_timer_fn+0x190/0x720 kernel/time/timer.c:1325
       expire_timers kernel/time/timer.c:1362 [inline]
       __run_timers kernel/time/timer.c:1681 [inline]
       __run_timers kernel/time/timer.c:1649 [inline]
       run_timer_softirq+0x652/0x1700 kernel/time/timer.c:1694
       __do_softirq+0x266/0x95a kernel/softirq.c:293
       invoke_softirq kernel/softirq.c:374 [inline]
       irq_exit+0x180/0x1d0 kernel/softirq.c:414
       exiting_irq arch/x86/include/asm/apic.h:536 [inline]
       smp_apic_timer_interrupt+0x14a/0x570 arch/x86/kernel/apic/apic.c:1062
       apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:807
      
      Fixes: ed0de45a1008 ("ipv4: recompile ip options in ipv4_link_failure")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Cc: Stephen Suryaputra <ssuryaextr@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0a7e8300
    • Stephen Suryaputra's avatar
      ipv4: recompile ip options in ipv4_link_failure · 3d988fcd
      Stephen Suryaputra authored
      [ Upstream commit ed0de45a1008991fdaa27a0152befcb74d126a8b ]
      
      Recompile IP options since IPCB may not be valid anymore when
      ipv4_link_failure is called from arp_error_report.
      
      Refer to the commit 3da1ed7ac398 ("net: avoid use IPCB in cipso_v4_error")
      and the commit before that (9ef6b42ad6fd) for a similar issue.
      Signed-off-by: default avatarStephen Suryaputra <ssuryaextr@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3d988fcd
    • Jason Wang's avatar
      vhost: reject zero size iova range · f052bafe
      Jason Wang authored
      [ Upstream commit 813dbeb656d6c90266f251d8bd2b02d445afa63f ]
      
      We used to accept zero size iova range which will lead a infinite loop
      in translate_desc(). Fixing this by failing the request in this case.
      
      Reported-by: syzbot+d21e6e297322a900c128@syzkaller.appspotmail.com
      Fixes: 6b1e6cc7 ("vhost: new device IOTLB API")
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f052bafe
    • Hangbin Liu's avatar
      team: set slave to promisc if team is already in promisc mode · b205097f
      Hangbin Liu authored
      [ Upstream commit 43c2adb9df7ddd6560fd3546d925b42cef92daa0 ]
      
      After adding a team interface to bridge, the team interface will enter
      promisc mode. Then if we add a new slave to team0, the slave will keep
      promisc off. Fix it by setting slave to promisc on if team master is
      already in promisc mode, also do the same for allmulti.
      
      v2: add promisc and allmulti checking when delete ports
      
      Fixes: 3d249d4c ("net: introduce ethernet teaming device")
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b205097f
    • Eric Dumazet's avatar
      tcp: tcp_grow_window() needs to respect tcp_space() · 07b1747f
      Eric Dumazet authored
      [ Upstream commit 50ce163a72d817a99e8974222dcf2886d5deb1ae ]
      
      For some reason, tcp_grow_window() correctly tests if enough room
      is present before attempting to increase tp->rcv_ssthresh,
      but does not prevent it to grow past tcp_space()
      
      This is causing hard to debug issues, like failing
      the (__tcp_select_window(sk) >= tp->rcv_wnd) test
      in __tcp_ack_snd_check(), causing ACK delays and possibly
      slow flows.
      
      Depending on tcp_rmem[2], MTU, skb->len/skb->truesize ratio,
      we can see the problem happening on "netperf -t TCP_RR -- -r 2000,2000"
      after about 60 round trips, when the active side no longer sends
      immediate acks.
      
      This bug predates git history.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Acked-by: default avatarWei Wang <weiwan@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      07b1747f
    • Lorenzo Bianconi's avatar
      net: fou: do not use guehdr after iptunnel_pull_offloads in gue_udp_recv · 8835f1c7
      Lorenzo Bianconi authored
      [ Upstream commit 988dc4a9a3b66be75b30405a5494faf0dc7cffb6 ]
      
      gue tunnels run iptunnel_pull_offloads on received skbs. This can
      determine a possible use-after-free accessing guehdr pointer since
      the packet will be 'uncloned' running pskb_expand_head if it is a
      cloned gso skb (e.g if the packet has been sent though a veth device)
      
      Fixes: a09a4c8d ("tunnels: Remove encapsulation offloads on decap")
      Signed-off-by: default avatarLorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8835f1c7
    • Nikolay Aleksandrov's avatar
      net: bridge: multicast: use rcu to access port list from br_multicast_start_querier · c1e7e01e
      Nikolay Aleksandrov authored
      [ Upstream commit c5b493ce192bd7a4e7bd073b5685aad121eeef82 ]
      
      br_multicast_start_querier() walks over the port list but it can be
      called from a timer with only multicast_lock held which doesn't protect
      the port list, so use RCU to walk over it.
      
      Fixes: c83b8fab ("bridge: Restart queries when last querier expires")
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c1e7e01e
    • Nikolay Aleksandrov's avatar
      net: bridge: fix per-port af_packet sockets · a1a9b69e
      Nikolay Aleksandrov authored
      [ Upstream commit 3b2e2904deb314cc77a2192f506f2fd44e3d10d0 ]
      
      When the commit below was introduced it changed two visible things:
       - the skb was no longer passed through the protocol handlers with the
         original device
       - the skb was passed up the stack with skb->dev = bridge
      
      The first change broke af_packet sockets on bridge ports. For example we
      use them for hostapd which listens for ETH_P_PAE packets on the ports.
      We discussed two possible fixes:
       - create a clone and pass it through NF_HOOK(), act on the original skb
         based on the result
       - somehow signal to the caller from the okfn() that it was called,
         meaning the skb is ok to be passed, which this patch is trying to
         implement via returning 1 from the bridge link-local okfn()
      
      Note that we rely on the fact that NF_QUEUE/STOLEN would return 0 and
      drop/error would return < 0 thus the okfn() is called only when the
      return was 1, so we signal to the caller that it was called by preserving
      the return value from nf_hook().
      
      Fixes: 8626c56c ("bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdict")
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a1a9b69e
    • Gustavo A. R. Silva's avatar
      net: atm: Fix potential Spectre v1 vulnerabilities · 9fac50c3
      Gustavo A. R. Silva authored
      [ Upstream commit 899537b73557aafbdd11050b501cf54b4f5c45af ]
      
      arg is controlled by user-space, hence leading to a potential
      exploitation of the Spectre variant 1 vulnerability.
      
      This issue was detected with the help of Smatch:
      
      net/atm/lec.c:715 lec_mcast_attach() warn: potential spectre issue 'dev_lec' [r] (local cap)
      
      Fix this by sanitizing arg before using it to index dev_lec.
      
      Notice that given that speculation windows are large, the policy is
      to kill the speculation on the first load and not worry if it can be
      completed with a dependent load/store [1].
      
      [1] https://lore.kernel.org/lkml/20180423164740.GY17484@dhcp22.suse.cz/Signed-off-by: default avatarGustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9fac50c3
    • Sabrina Dubroca's avatar
      bonding: fix event handling for stacked bonds · a1b3ec23
      Sabrina Dubroca authored
      [ Upstream commit 92480b3977fd3884649d404cbbaf839b70035699 ]
      
      When a bond is enslaved to another bond, bond_netdev_event() only
      handles the event as if the bond is a master, and skips treating the
      bond as a slave.
      
      This leads to a refcount leak on the slave, since we don't remove the
      adjacency to its master and the master holds a reference on the slave.
      
      Reproducer:
        ip link add bondL type bond
        ip link add bondU type bond
        ip link set bondL master bondU
        ip link del bondL
      
      No "Fixes:" tag, this code is older than git history.
      Signed-off-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a1b3ec23