1. 10 Jul, 2019 1 commit
  2. 11 Jun, 2019 1 commit
    • Dan Carpenter's avatar
      test_firmware: Use correct snprintf() limit · 2418da02
      Dan Carpenter authored
      commit bd17cc5a upstream.
      The limit here is supposed to be how much of the page is left, but it's
      just using PAGE_SIZE as the limit.
      The other thing to remember is that snprintf() returns the number of
      bytes which would have been copied if we had had enough room.  So that
      means that if we run out of space then this code would end up passing a
      negative value as the limit and the kernel would print an error message.
      I have change the code to use scnprintf() which returns the number of
      bytes that were successfully printed (not counting the NUL terminator).
      Fixes: c92316bf ("test_firmware: add batched firmware tests")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
  3. 31 May, 2019 3 commits
  4. 25 May, 2019 1 commit
    • Gary Hook's avatar
      x86/mm/mem_encrypt: Disable all instrumentation for early SME setup · 489e5d8c
      Gary Hook authored
      [ Upstream commit b51ce374 ]
      Enablement of AMD's Secure Memory Encryption feature is determined very
      early after start_kernel() is entered. Part of this procedure involves
      scanning the command line for the parameter 'mem_encrypt'.
      To determine intended state, the function sme_enable() uses library
      functions cmdline_find_option() and strncmp(). Their use occurs early
      enough such that it cannot be assumed that any instrumentation subsystem
      is initialized.
      For example, making calls to a KASAN-instrumented function before KASAN
      is set up will result in the use of uninitialized memory and a boot
      When AMD's SME support is enabled, conditionally disable instrumentation
      of these dependent functions in lib/string.c and arch/x86/lib/cmdline.c.
       [ bp: Get rid of intermediary nostackp var and cleanup whitespace. ]
      Fixes: aca20d54 ("x86/mm: Add support to make use of Secure Memory Encryption")
      Reported-by: default avatarLi RongQing <lirongqing@baidu.com>
      Signed-off-by: default avatarGary R Hook <gary.hook@amd.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Boris Brezillon <bbrezillon@kernel.org>
      Cc: Coly Li <colyli@suse.de>
      Cc: "dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Kent Overstreet <kent.overstreet@gmail.com>
      Cc: "luto@kernel.org" <luto@kernel.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: "mingo@redhat.com" <mingo@redhat.com>
      Cc: "peterz@infradead.org" <peterz@infradead.org>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/155657657552.7116.18363762932464011367.stgit@sosrh3.amd.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
  5. 21 May, 2019 1 commit
  6. 10 May, 2019 1 commit
  7. 08 May, 2019 2 commits
  8. 02 May, 2019 1 commit
  9. 20 Apr, 2019 1 commit
  10. 17 Apr, 2019 1 commit
  11. 05 Apr, 2019 3 commits
    • Nathan Chancellor's avatar
      ARM: 8833/1: Ensure that NEON code always compiles with Clang · 416b593a
      Nathan Chancellor authored
      [ Upstream commit de9c0d49 ]
      While building arm32 allyesconfig, I ran into the following errors:
        arch/arm/lib/xor-neon.c:17:2: error: You should compile this file with
        '-mfloat-abi=softfp -mfpu=neon'
        In file included from lib/raid6/neon1.c:27:
        error: "NEON support not enabled"
      Building V=1 showed NEON_FLAGS getting passed along to Clang but
      __ARM_NEON__ was not getting defined. Ultimately, it boils down to Clang
      only defining __ARM_NEON__ when targeting armv7, rather than armv6k,
      which is the '-march' value for allyesconfig.
      >From lib/Basic/Targets/ARM.cpp in the Clang source:
        // This only gets set when Neon instructions are actually available, unlike
        // the VFP define, hence the soft float and arch check. This is subtly
        // different from gcc, we follow the intent which was that it should be set
        // when Neon instructions are actually available.
        if ((FPU & NeonFPU) && !SoftFloat && ArchVersion >= 7) {
          Builder.defineMacro("__ARM_NEON", "1");
          // current AArch32 NEON implementations do not support double-precision
          // floating-point even when it is present in VFP.
                              "0x" + Twine::utohexstr(HW_FP & ~HW_FP_DP));
      Ard Biesheuvel recommended explicitly adding '-march=armv7-a' at the
      beginning of the NEON_FLAGS definitions so that __ARM_NEON__ always gets
      definined by Clang. This doesn't functionally change anything because
      that code will only run where NEON is supported, which is implicitly
      Link: https://github.com/ClangBuiltLinux/linux/issues/287Suggested-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarNathan Chancellor <natechancellor@gmail.com>
      Acked-by: default avatarNicolas Pitre <nico@linaro.org>
      Reviewed-by: default avatarNick Desaulniers <ndesaulniers@google.com>
      Reviewed-by: default avatarStefan Agner <stefan@agner.ch>
      Signed-off-by: default avatarRussell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
    • Andrea Righi's avatar
      kprobes: Prohibit probing on bsearch() · e62824d1
      Andrea Righi authored
      [ Upstream commit 02106f88 ]
      Since kprobe breakpoing handler is using bsearch(), probing on this
      routine can cause recursive breakpoint problem.
               ->bsearch() -> int3
      Prohibit probing on bsearch().
      Signed-off-by: default avatarAndrea Righi <righi.andrea@gmail.com>
      Acked-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/154998813406.31052.8791425358974650922.stgit@devboxSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
    • Peter Zijlstra's avatar
      lib/int_sqrt: optimize initial value compute · 083aa6a5
      Peter Zijlstra authored
      commit f8ae107e upstream.
      The initial value (@m) compute is:
      	m = 1UL << (BITS_PER_LONG - 2);
      	while (m > x)
      		m >>= 2;
      Which is a linear search for the highest even bit smaller or equal to @x
      We can implement this using a binary search using __fls() (or better when
      its hardware implemented).
      	m = 1UL << (__fls(x) & ~1UL);
      Especially for small values of @x; which are the more common arguments
      when doing a CDF on idle times; the linear search is near to worst case,
      while the binary search of __fls() is a constant 6 (or 5 on 32bit)
            cycles:                 branches:              branch-misses:
      hot:   43.633557 +- 0.034373  45.333132 +- 0.002277  0.023529 +- 0.000681
      cold: 207.438411 +- 0.125840  45.333132 +- 0.002277  6.976486 +- 0.004219
      hot:   29.576176 +- 0.028850  26.666730 +- 0.004511  0.019463 +- 0.000663
      cold: 165.947136 +- 0.188406  26.666746 +- 0.004511  6.133897 +- 0.004386
      hot:   24.720922 +- 0.025161  20.666784 +- 0.004509  0.020836 +- 0.000677
      cold: 132.777197 +- 0.127471  20.666776 +- 0.004509  5.080285 +- 0.003874
      Averages computed over all values <128k using a LFSR to generate order.
      Cold numbers have a LFSR based branch trace buffer 'confuser' ran between
      each int_sqrt() invocation.
      Link: http://lkml.kernel.org/r/20171020164644.936577234@infradead.orgSigned-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Suggested-by: default avatarJoe Perches <joe@perches.com>
      Acked-by: default avatarWill Deacon <will.deacon@arm.com>
      Acked-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Anshul Garg <aksgarg1989@gmail.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Cc: Michael Davidson <md@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
  12. 03 Apr, 2019 1 commit
  13. 27 Mar, 2019 1 commit
    • Peter Zijlstra's avatar
      lib/int_sqrt: optimize small argument · aab86217
      Peter Zijlstra authored
      commit 3f329570 upstream.
      The current int_sqrt() computation is sub-optimal for the case of small
      @x.  Which is the interesting case when we're going to do cumulative
      distribution functions on idle times, which we assume to be a random
      variable, where the target residency of the deepest idle state gives an
      upper bound on the variable (5e6ns on recent Intel chips).
      In the case of small @x, the compute loop:
      	while (m != 0) {
      		b = y + m;
      		y >>= 1;
      		if (x >= b) {
      			x -= b;
      			y += m;
      		m >>= 2;
      can be reduced to:
      	while (m > x)
      		m >>= 2;
      Because y==0, b==m and until x>=m y will remain 0.
      And while this is computationally equivalent, it runs much faster
      because there's less code, in particular less branches.
            cycles:                 branches:              branch-misses:
      hot:   45.109444 +- 0.044117  44.333392 +- 0.002254  0.018723 +- 0.000593
      cold: 187.737379 +- 0.156678  44.333407 +- 0.002254  6.272844 +- 0.004305
      hot:   67.937492 +- 0.064124  66.999535 +- 0.000488  0.066720 +- 0.001113
      cold: 232.004379 +- 0.332811  66.999527 +- 0.000488  6.914634 +- 0.006568
      hot:   43.633557 +- 0.034373  45.333132 +- 0.002277  0.023529 +- 0.000681
      cold: 207.438411 +- 0.125840  45.333132 +- 0.002277  6.976486 +- 0.004219
      Averages computed over all values <128k using a LFSR to generate order.
      Cold numbers have a LFSR based branch trace buffer 'confuser' ran between
      each int_sqrt() invocation.
      Link: http://lkml.kernel.org/r/20171020164644.876503355@infradead.org
      Fixes: 30493cc9 ("lib/int_sqrt.c: optimize square root algorithm")
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Suggested-by: default avatarAnshul Garg <aksgarg1989@gmail.com>
      Acked-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Michael Davidson <md@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
  14. 23 Mar, 2019 1 commit
    • David Howells's avatar
      assoc_array: Fix shortcut creation · d366f513
      David Howells authored
      [ Upstream commit bb2ba2d7 ]
      Fix the creation of shortcuts for which the length of the index key value
      is an exact multiple of the machine word size.  The problem is that the
      code that blanks off the unused bits of the shortcut value malfunctions if
      the number of bits in the last word equals machine word size.  This is due
      to the "<<" operator being given a shift of zero in this case, and so the
      mask that should be all zeros is all ones instead.  This causes the
      subsequent masking operation to clear everything rather than clearing
      Ordinarily, the presence of the hash at the beginning of the tree index key
      makes the issue very hard to test for, but in this case, it was encountered
      due to a development mistake that caused the hash output to be either 0
      (keyring) or 1 (non-keyring) only.  This made it susceptible to the
      keyctl/unlink/valid test in the keyutils package.
      The fix is simply to skip the blanking if the shift would be 0.  For
      example, an index key that is 64 bits long would produce a 0 shift and thus
      a 'blank' of all 1s.  This would then be inverted and AND'd onto the
      index_key, incorrectly clearing the entire last word.
      Fixes: 3cb98950 ("Add a generic associative array implementation.")
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarJames Morris <james.morris@microsoft.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
  15. 13 Mar, 2019 1 commit
  16. 12 Feb, 2019 1 commit
  17. 13 Jan, 2019 2 commits
  18. 17 Dec, 2018 2 commits
  19. 13 Dec, 2018 2 commits
  20. 08 Dec, 2018 2 commits
  21. 05 Dec, 2018 1 commit
  22. 27 Nov, 2018 1 commit
  23. 21 Nov, 2018 1 commit
  24. 13 Nov, 2018 1 commit
    • Waiman Long's avatar
      locking/lockdep: Fix debug_locks off performance problem · 5cf2ab06
      Waiman Long authored
      [ Upstream commit 9506a742 ]
      It was found that when debug_locks was turned off because of a problem
      found by the lockdep code, the system performance could drop quite
      significantly when the lock_stat code was also configured into the
      kernel. For instance, parallel kernel build time on a 4-socket x86-64
      server nearly doubled.
      Further analysis into the cause of the slowdown traced back to the
      frequent call to debug_locks_off() from the __lock_acquired() function
      probably due to some inconsistent lockdep states with debug_locks
      off. The debug_locks_off() function did an unconditional atomic xchg
      to write a 0 value into debug_locks which had already been set to 0.
      This led to severe cacheline contention in the cacheline that held
      debug_locks.  As debug_locks is being referenced in quite a few different
      places in the kernel, this greatly slow down the system performance.
      To prevent that trashing of debug_locks cacheline, lock_acquired()
      and lock_contended() now checks the state of debug_locks before
      proceeding. The debug_locks_off() function is also modified to check
      debug_locks before calling __debug_locks_off().
      Signed-off-by: default avatarWaiman Long <longman@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: http://lkml.kernel.org/r/1539913518-15598-1-git-send-email-longman@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
  25. 04 Nov, 2018 1 commit
  26. 04 Oct, 2018 1 commit
    • Bart Van Assche's avatar
      scsi: klist: Make it safe to use klists in atomic context · 1390c37d
      Bart Van Assche authored
      [ Upstream commit 624fa779 ]
      In the scsi_transport_srp implementation it cannot be avoided to
      iterate over a klist from atomic context when using the legacy block
      layer instead of blk-mq. Hence this patch that makes it safe to use
      klists in atomic context. This patch avoids that lockdep reports the
      WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
       Possible interrupt unsafe locking scenario:
             CPU0                    CPU1
             ----                    ----
      stack backtrace:
      Workqueue: kblockd blk_timeout_work
      Call Trace:
       srp_timed_out+0xaf/0x1d0 [scsi_transport_srp]
       scsi_times_out+0xd4/0x410 [scsi_mod]
      See also commit c9ddf734 ("scsi: scsi_transport_srp: Fix shost to
      rport translation").
      Signed-off-by: default avatarBart Van Assche <bart.vanassche@wdc.com>
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: James Bottomley <jejb@linux.vnet.ibm.com>
      Acked-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarSasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
  27. 23 Sep, 2018 5 commits
    • Philippe Gerum's avatar
      mm: ipipe: disable ondemand memory · 8168e4dd
      Philippe Gerum authored
      Co-kernels cannot bear with the extra latency caused by memory access
      faults involved in COW or
      overcommit. __ipipe_disable_ondemand_mappings() force commits all
      common memory mappings with physical RAM.
      In addition, the architecture code is given a chance to pre-load page
      table entries for ioremap and vmalloc memory, for preventing further
      minor faults accessing such memory due to PTE misses (if that ever
      makes sense for them).
      Revisit: Further COW breaking in copy_user_page() and copy_pte_range()
      may be useless once __ipipe_disable_ondemand_mappings() has run for a
      co-kernel task, since all of its mappings have been populated, and
      unCOWed if applicable.
    • Philippe Gerum's avatar
      dump_stack: ipipe: make dump_stack() domain-aware · 16d710c9
      Philippe Gerum authored
      When dumping a stack backtrace, we neither need nor want to disable
      root stage IRQs over the head stage, where CPU migration can't
      Conversely, we neither need nor want to disable hard IRQs from the
      head stage, so that latency won't skyrocket either.
    • Philippe Gerum's avatar
      lib/smp_processor_id: ipipe: exclude head domain from preemption check · 0853aafd
      Philippe Gerum authored
      There can be no CPU migration from the head stage, however the
      out-of-band code currently running smp_processor_id() might have
      preempted the regular kernel code from within a preemptible section,
      which might cause false positive in the end.
      These are the two reasons why we certainly neither need nor want to do
      the preemption check in that case.
    • Philippe Gerum's avatar
      atomic: ipipe: keep atomic when pipelining IRQs · 6d79de03
      Philippe Gerum authored
      Because of the virtualization of interrupt masking for the regular
      kernel code when the pipeline is enabled, atomic helpers relying on
      common interrupt disabling helpers such as local_irq_save/restore
      pairs would not be atomic anymore, leading to data corruption.
      This commit restores true atomicity for the atomic helpers that would
      be otherwise affected by interrupt virtualization.
    • Philippe Gerum's avatar
      locking: ipipe: add hard lock alternative to regular spinlocks · 659139c0
      Philippe Gerum authored
      Hard spinlocks manipulate the CPU interrupt mask, without affecting
      the kernel preemption state in locking/unlocking operations.
      This type of spinlock is useful for implementing a critical section to
      serialize concurrent accesses from both in-band and out-of-band
      contexts, i.e. from root and head stages.
      Hard spinlocks exclusively depend on the pre-existing arch-specific
      bits which implement regular spinlocks. They can be seen as basic
      spinlocks still affecting the CPU's interrupt state when all other
      spinlock types only deal with the virtual interrupt flag managed by
      the pipeline core - i.e. only disable interrupts for the regular
      in-band kernel activity.