1. 22 Jun, 2017 1 commit
    • Kees Cook's avatar
      gcc-plugins: Add the randstruct plugin · 313dd1b6
      Kees Cook authored
      This randstruct plugin is modified from Brad Spengler/PaX Team's code
      in the last public patch of grsecurity/PaX based on my understanding
      of the code. Changes or omissions from the original code are mine and
      don't reflect the original grsecurity/PaX code.
      
      The randstruct GCC plugin randomizes the layout of selected structures
      at compile time, as a probabilistic defense against attacks that need to
      know the layout of structures within the kernel. This is most useful for
      "in-house" kernel builds where neither the randomization seed nor other
      build artifacts are made available to an attacker. While less useful for
      distribution kernels (where the randomization seed must be exposed for
      third party kernel module builds), it still has some value there since now
      all kernel builds would need to be tracked by an attacker.
      
      In more performance sensitive scenarios, GCC_PLUGIN_RANDSTRUCT_PERFORMANCE
      can be selected to make a best effort to restrict randomization to
      cacheline-sized groups of elements, and will not randomize bitfields. This
      comes at the cost of reduced randomization.
      
      Two annotations are defined,__randomize_layout and __no_randomize_layout,
      which respectively tell the plugin to either randomize or not to
      randomize instances of the struct in question. Follow-on patches enable
      the auto-detection logic for selecting structures for randomization
      that contain only function pointers. It is disabled here to assist with
      bisection.
      
      Since any randomized structs must be initialized using designated
      initializers, __randomize_layout includes the __designated_init annotation
      even when the plugin is disabled so that all builds will require
      the needed initialization. (With the plugin enabled, annotations for
      automatically chosen structures are marked as well.)
      
      The main differences between this implemenation and grsecurity are:
      - disable automatic struct selection (to be enabled in follow-up patch)
      - add designated_init attribute at runtime and for manual marking
      - clarify debugging output to differentiate bad cast warnings
      - add whitelisting infrastructure
      - support gcc 7's DECL_ALIGN and DECL_MODE changes (Laura Abbott)
      - raise minimum required GCC version to 4.7
      
      Earlier versions of this patch series were ported by Michael Leibowitz.
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      313dd1b6
  2. 09 May, 2017 1 commit
    • Hari Bathini's avatar
      crash: move crashkernel parsing and vmcore related code under CONFIG_CRASH_CORE · 692f66f2
      Hari Bathini authored
      Patch series "kexec/fadump: remove dependency with CONFIG_KEXEC and
      reuse crashkernel parameter for fadump", v4.
      
      Traditionally, kdump is used to save vmcore in case of a crash.  Some
      architectures like powerpc can save vmcore using architecture specific
      support instead of kexec/kdump mechanism.  Such architecture specific
      support also needs to reserve memory, to be used by dump capture kernel.
      crashkernel parameter can be a reused, for memory reservation, by such
      architecture specific infrastructure.
      
      This patchset removes dependency with CONFIG_KEXEC for crashkernel
      parameter and vmcoreinfo related code as it can be reused without kexec
      support.  Also, crashkernel parameter is reused instead of
      fadump_reserve_mem to reserve memory for fadump.
      
      The first patch moves crashkernel parameter parsing and vmcoreinfo
      related code under CONFIG_CRASH_CORE instead of CONFIG_KEXEC_CORE.  The
      second patch reuses the definitions of append_elf_note() & final_note()
      functions under CONFIG_CRASH_CORE in IA64 arch code.  The third patch
      removes dependency on CONFIG_KEXEC for firmware-assisted dump (fadump)
      in powerpc.  The next patch reuses crashkernel parameter for reserving
      memory for fadump, instead of the fadump_reserve_mem parameter.  This
      has the advantage of using all syntaxes crashkernel parameter supports,
      for fadump as well.  The last patch updates fadump kernel documentation
      about use of crashkernel parameter.
      
      This patch (of 5):
      
      Traditionally, kdump is used to save vmcore in case of a crash.  Some
      architectures like powerpc can save vmcore using architecture specific
      support instead of kexec/kdump mechanism.  Such architecture specific
      support also needs to reserve memory, to be used by dump capture kernel.
      crashkernel parameter can be a reused, for memory reservation, by such
      architecture specific infrastructure.
      
      But currently, code related to vmcoreinfo and parsing of crashkernel
      parameter is built under CONFIG_KEXEC_CORE.  This patch introduces
      CONFIG_CRASH_CORE and moves the above mentioned code under this config,
      allowing code reuse without dependency on CONFIG_KEXEC.  There is no
      functional change with this patch.
      
      Link: http://lkml.kernel.org/r/149035338104.6881.4550894432615189948.stgit@hbathini.in.ibm.comSigned-off-by: default avatarHari Bathini <hbathini@linux.vnet.ibm.com>
      Acked-by: default avatarDave Young <dyoung@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      692f66f2
  3. 26 Apr, 2017 1 commit
  4. 18 Apr, 2017 1 commit
  5. 28 Mar, 2017 1 commit
  6. 13 Mar, 2017 1 commit
    • Dmitry Safonov's avatar
      x86/mm: Introduce mmap_compat_base() for 32-bit mmap() · 1b028f78
      Dmitry Safonov authored
      mmap() uses a base address, from which it starts to look for a free space
      for allocation.
      
      The base address is stored in mm->mmap_base, which is calculated during
      exec(). The address depends on task's size, set rlimit for stack, ASLR
      randomization. The base depends on the task size and the number of random
      bits which are different for 64-bit and 32bit applications.
      
      Due to the fact, that the base address is fixed, its mmap() from a compat
      (32bit) syscall issued by a 64bit task will return a address which is based
      on the 64bit base address and does not fit into the 32bit address space
      (4GB). The returned pointer is truncated to 32bit, which results in an
      invalid address.
      
      To solve store a seperate compat address base plus a compat legacy address
      base in mm_struct. These bases are calculated at exec() time and can be
      used later to address the 32bit compat mmap() issued by 64 bit
      applications.
      
      As a consequence of this change 32-bit applications issuing a 64-bit
      syscall (after doing a long jump) will get a 64-bit mapping now. Before
      this change 32-bit applications always got a 32bit mapping.
      
      [ tglx: Massaged changelog and added a comment ]
      Signed-off-by: default avatarDmitry Safonov <dsafonov@virtuozzo.com>
      Cc: 0x7f454c46@gmail.com
      Cc: linux-mm@kvack.org
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Link: http://lkml.kernel.org/r/20170306141721.9188-4-dsafonov@virtuozzo.comSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      1b028f78
  7. 08 Mar, 2017 1 commit
    • Josh Poimboeuf's avatar
      stacktrace/x86: add function for detecting reliable stack traces · af085d90
      Josh Poimboeuf authored
      For live patching and possibly other use cases, a stack trace is only
      useful if it can be assured that it's completely reliable.  Add a new
      save_stack_trace_tsk_reliable() function to achieve that.
      
      Note that if the target task isn't the current task, and the target task
      is allowed to run, then it could be writing the stack while the unwinder
      is reading it, resulting in possible corruption.  So the caller of
      save_stack_trace_tsk_reliable() must ensure that the task is either
      'current' or inactive.
      
      save_stack_trace_tsk_reliable() relies on the x86 unwinder's detection
      of pt_regs on the stack.  If the pt_regs are not user-mode registers
      from a syscall, then they indicate an in-kernel interrupt or exception
      (e.g. preemption or a page fault), in which case the stack is considered
      unreliable due to the nature of frame pointers.
      
      It also relies on the x86 unwinder's detection of other issues, such as:
      
      - corrupted stack data
      - stack grows the wrong way
      - stack walk doesn't reach the bottom
      - user didn't provide a large enough entries array
      
      Such issues are reported by checking unwind_error() and !unwind_done().
      
      Also add CONFIG_HAVE_RELIABLE_STACKTRACE so arch-independent code can
      determine at build time whether the function is implemented.
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Reviewed-by: default avatarMiroslav Benes <mbenes@suse.cz>
      Acked-by: Ingo Molnar <mingo@kernel.org>	# for the x86 changes
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      af085d90
  8. 28 Feb, 2017 1 commit
  9. 25 Feb, 2017 1 commit
  10. 21 Feb, 2017 1 commit
    • Daniel Borkmann's avatar
      arch: add ARCH_HAS_SET_MEMORY config · d2852a22
      Daniel Borkmann authored
      Currently, there's no good way to test for the presence of
      set_memory_ro/rw/x/nx() helpers implemented by archs such as
      x86, arm, arm64 and s390.
      
      There's DEBUG_SET_MODULE_RONX and DEBUG_RODATA, however both
      don't really reflect that: set_memory_*() are also available
      even when DEBUG_SET_MODULE_RONX is turned off, and DEBUG_RODATA
      is set by parisc, but doesn't implement above functions. Thus,
      add ARCH_HAS_SET_MEMORY that is selected by mentioned archs,
      where generic code can test against this.
      
      This also allows later on to move DEBUG_SET_MODULE_RONX out of
      the arch specific Kconfig to define it only once depending on
      ARCH_HAS_SET_MEMORY.
      Suggested-by: default avatarLaura Abbott <labbott@redhat.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d2852a22
  11. 07 Feb, 2017 2 commits
  12. 18 Jan, 2017 2 commits
  13. 20 Dec, 2016 1 commit
    • Thiago Jung Bauermann's avatar
      powerpc: ima: get the kexec buffer passed by the previous kernel · 467d2782
      Thiago Jung Bauermann authored
      Patch series "ima: carry the measurement list across kexec", v8.
      
      The TPM PCRs are only reset on a hard reboot.  In order to validate a
      TPM's quote after a soft reboot (eg.  kexec -e), the IMA measurement
      list of the running kernel must be saved and then restored on the
      subsequent boot, possibly of a different architecture.
      
      The existing securityfs binary_runtime_measurements file conveniently
      provides a serialized format of the IMA measurement list.  This patch
      set serializes the measurement list in this format and restores it.
      
      Up to now, the binary_runtime_measurements was defined as architecture
      native format.  The assumption being that userspace could and would
      handle any architecture conversions.  With the ability of carrying the
      measurement list across kexec, possibly from one architecture to a
      different one, the per boot architecture information is lost and with it
      the ability of recalculating the template digest hash.  To resolve this
      problem, without breaking the existing ABI, this patch set introduces
      the boot command line option "ima_canonical_fmt", which is arbitrarily
      defined as little endian.
      
      The need for this boot command line option will be limited to the
      existing version 1 format of the binary_runtime_measurements.
      Subsequent formats will be defined as canonical format (eg.  TPM 2.0
      support for larger digests).
      
      A simplified method of Thiago Bauermann's "kexec buffer handover" patch
      series for carrying the IMA measurement list across kexec is included in
      this patch set.  The simplified method requires all file measurements be
      taken prior to executing the kexec load, as subsequent measurements will
      not be carried across the kexec and restored.
      
      This patch (of 10):
      
      The IMA kexec buffer allows the currently running kernel to pass the
      measurement list via a kexec segment to the kernel that will be kexec'd.
      The second kernel can check whether the previous kernel sent the buffer
      and retrieve it.
      
      This is the architecture-specific part which enables IMA to receive the
      measurement list passed by the previous kernel.  It will be used in the
      next patch.
      
      The change in machine_kexec_64.c is to factor out the logic of removing
      an FDT memory reservation so that it can be used by remove_ima_buffer.
      
      Link: http://lkml.kernel.org/r/1480554346-29071-2-git-send-email-zohar@linux.vnet.ibm.comSigned-off-by: default avatarThiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
      Signed-off-by: default avatarMimi Zohar <zohar@linux.vnet.ibm.com>
      Acked-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Andreas Steffen <andreas.steffen@strongswan.org>
      Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
      Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      467d2782
  14. 12 Dec, 2016 1 commit
  15. 15 Nov, 2016 1 commit
  16. 09 Nov, 2016 1 commit
    • Kees Cook's avatar
      gcc-plugins: Adjust Kconfig to avoid cyc_complexity · 215e2aa6
      Kees Cook authored
      In preparation for removing "depends on !COMPILE_TEST" from GCC_PLUGINS,
      the GCC_PLUGIN_CYC_COMPLEXITY plugin needs to gain the restriction,
      since it is mainly an example, and produces (intended) voluminous stderr
      reporting, which is generally undesirable for allyesconfig-style build
      tests. This additionally puts the plugin behind EXPERT and improves the
      help text.
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      215e2aa6
  17. 10 Oct, 2016 1 commit
    • Emese Revfy's avatar
      gcc-plugins: Add latent_entropy plugin · 38addce8
      Emese Revfy authored
      This adds a new gcc plugin named "latent_entropy". It is designed to
      extract as much possible uncertainty from a running system at boot time as
      possible, hoping to capitalize on any possible variation in CPU operation
      (due to runtime data differences, hardware differences, SMP ordering,
      thermal timing variation, cache behavior, etc).
      
      At the very least, this plugin is a much more comprehensive example for
      how to manipulate kernel code using the gcc plugin internals.
      
      The need for very-early boot entropy tends to be very architecture or
      system design specific, so this plugin is more suited for those sorts
      of special cases. The existing kernel RNG already attempts to extract
      entropy from reliable runtime variation, but this plugin takes the idea to
      a logical extreme by permuting a global variable based on any variation
      in code execution (e.g. a different value (and permutation function)
      is used to permute the global based on loop count, case statement,
      if/then/else branching, etc).
      
      To do this, the plugin starts by inserting a local variable in every
      marked function. The plugin then adds logic so that the value of this
      variable is modified by randomly chosen operations (add, xor and rol) and
      random values (gcc generates separate static values for each location at
      compile time and also injects the stack pointer at runtime). The resulting
      value depends on the control flow path (e.g., loops and branches taken).
      
      Before the function returns, the plugin mixes this local variable into
      the latent_entropy global variable. The value of this global variable
      is added to the kernel entropy pool in do_one_initcall() and _do_fork(),
      though it does not credit any bytes of entropy to the pool; the contents
      of the global are just used to mix the pool.
      
      Additionally, the plugin can pre-initialize arrays with build-time
      random contents, so that two different kernel builds running on identical
      hardware will not have the same starting values.
      Signed-off-by: default avatarEmese Revfy <re.emese@gmail.com>
      [kees: expanded commit message and code comments]
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      38addce8
  18. 22 Sep, 2016 1 commit
  19. 09 Sep, 2016 2 commits
    • Nicholas Piggin's avatar
      kbuild: allow archs to select link dead code/data elimination · b67067f1
      Nicholas Piggin authored
      Introduce LD_DEAD_CODE_DATA_ELIMINATION option for architectures to
      select to build with -ffunction-sections, -fdata-sections, and link
      with --gc-sections. It requires some work (documented) to ensure all
      unreferenced entrypoints are live, and requires toolchain and build
      verification, so it is made a per-arch option for now.
      
      On a random powerpc64le build, this yelds a significant size saving,
      it boots and runs fine, but there is a lot I haven't tested as yet, so
      these savings may be reduced if there are bugs in the link.
      
          text      data        bss        dec   filename
      11169741   1180744    1923176	14273661   vmlinux
      10445269   1004127    1919707	13369103   vmlinux.dce
      
      ~700K text, ~170K data, 6% removed from kernel image size.
      Signed-off-by: default avatarNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: default avatarMichal Marek <mmarek@suse.com>
      b67067f1
    • Stephen Rothwell's avatar
      kbuild: allow architectures to use thin archives instead of ld -r · a5967db9
      Stephen Rothwell authored
      ld -r is an incremental link used to create built-in.o files in build
      subdirectories. It produces relocatable object files containing all
      its input files, and these are are then pulled together and relocated
      in the final link. Aside from the bloat, this constrains the final
      link relocations, which has bitten large powerpc builds with
      unresolvable relocations in the final link.
      
      Alan Modra has recommended the kernel use thin archives for linking.
      This is an alternative and means that the linker has more information
      available to it when it links the kernel.
      
      This patch enables a config option architectures can select, which
      causes all built-in.o files to be built as thin archives. built-in.o
      files in subdirectories do not get symbol table or index attached,
      which improves speed and size. The final link pass creates a
      built-in.o archive in the root output directory which includes the
      symbol table and index. The linker then uses takes this file to link.
      
      The --whole-archive linker option is required, because the linker now
      has visibility to every individual object file, and it will otherwise
      just completely avoid including those without external references
      (consider a file with EXPORT_SYMBOL or initcall or hardware exceptions
      as its only entry points). The traditional built works "by luck" as
      built-in.o files are large enough that they're going to get external
      references. However this optimisation is unpredictable for the kernel
      (due to above external references), ineffective at culling unused, and
      costly because the .o files have to be searched for references.
      Superior alternatives for link-time culling should be used instead.
      
      Build characteristics for inclink vs thinarc, on a small powerpc64le
      pseries VM with a modest .config:
      
                                        inclink       thinarc
      sizes
      vmlinux                        15 618 680    15 625 028
      sum of all built-in.o          56 091 808     1 054 334
      sum excluding root built-in.o                   151 430
      
      find -name built-in.o | xargs rm ; time make vmlinux
      real                              22.772s       21.143s
      user                              13.280s       13.430s
      sys                                4.310s        2.750s
      
      - Final kernel pulled in only about 6K more, which shows how
        ineffective the object file culling is.
      - Build performance looks improved due to less pagecache activity.
        On IO constrained systems it could be a bigger win.
      - Build size saving is significant.
      
      Side note, the toochain understands archives, so there's some tricks,
      $ ar t built-in.o          # list all files you linked with
      $ size built-in.o          # and their sizes
      $ objdump -d built-in.o    # disassembly (unrelocated) with filenames
      
      Implementation by sfr, minor tweaks by npiggin.
      Signed-off-by: default avatarStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: default avatarNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: default avatarMichal Marek <mmarek@suse.com>
      a5967db9
  20. 07 Sep, 2016 1 commit
  21. 24 Aug, 2016 1 commit
  22. 09 Aug, 2016 1 commit
  23. 26 Jul, 2016 2 commits
  24. 24 Jun, 2016 1 commit
    • Linus Torvalds's avatar
      Clarify naming of thread info/stack allocators · b235beea
      Linus Torvalds authored
      We've had the thread info allocated together with the thread stack for
      most architectures for a long time (since the thread_info was split off
      from the task struct), but that is about to change.
      
      But the patches that move the thread info to be off-stack (and a part of
      the task struct instead) made it clear how confused the allocator and
      freeing functions are.
      
      Because the common case was that we share an allocation with the thread
      stack and the thread_info, the two pointers were identical.  That
      identity then meant that we would have things like
      
      	ti = alloc_thread_info_node(tsk, node);
      	...
      	tsk->stack = ti;
      
      which certainly _worked_ (since stack and thread_info have the same
      value), but is rather confusing: why are we assigning a thread_info to
      the stack? And if we move the thread_info away, the "confusing" code
      just gets to be entirely bogus.
      
      So remove all this confusion, and make it clear that we are doing the
      stack allocation by renaming and clarifying the function names to be
      about the stack.  The fact that the thread_info then shares the
      allocation is an implementation detail, and not really about the
      allocation itself.
      
      This is a pure renaming and type fix: we pass in the same pointer, it's
      just that we clarify what the pointer means.
      
      The ia64 code that actually only has one single allocation (for all of
      task_struct, thread_info and kernel thread stack) now looks a bit odd,
      but since "tsk->stack" is actually not even used there, that oddity
      doesn't matter.  It would be a separate thing to clean that up, I
      intentionally left the ia64 changes as a pure brute-force renaming and
      type change.
      Acked-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b235beea
  25. 18 Jun, 2016 1 commit
    • William Breathitt Gray's avatar
      isa: Allow ISA-style drivers on modern systems · 3a495511
      William Breathitt Gray authored
      Several modern devices, such as PC/104 cards, are expected to run on
      modern systems via an ISA bus interface. Since ISA is a legacy interface
      for most modern architectures, ISA support should remain disabled in
      general. Support for ISA-style drivers should be enabled on a per driver
      basis.
      
      To allow ISA-style drivers on modern systems, this patch introduces the
      ISA_BUS_API and ISA_BUS Kconfig options. The ISA bus driver will now
      build conditionally on the ISA_BUS_API Kconfig option, which defaults to
      the legacy ISA Kconfig option. The ISA_BUS Kconfig option allows the
      ISA_BUS_API Kconfig option to be selected on architectures which do not
      enable ISA (e.g. X86_64).
      
      The ISA_BUS Kconfig option is currently only implemented for X86
      architectures. Other architectures may have their own ISA_BUS Kconfig
      options added as required.
      Reviewed-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarWilliam Breathitt Gray <vilhelm.gray@gmail.com>
      Acked-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3a495511
  26. 07 Jun, 2016 3 commits
    • Emese Revfy's avatar
      Add sancov plugin · 543c37cb
      Emese Revfy authored
      The sancov gcc plugin inserts a __sanitizer_cov_trace_pc() call
      at the start of basic blocks.
      
      This plugin is a helper plugin for the kcov feature. It supports
      all gcc versions with plugin support (from gcc-4.5 on).
      It is based on the gcc commit "Add fuzzing coverage support" by Dmitry Vyukov
      (https://gcc.gnu.org/viewcvs/gcc?limit_changes=0&view=revision&revision=231296).
      Signed-off-by: default avatarEmese Revfy <re.emese@gmail.com>
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarMichal Marek <mmarek@suse.com>
      543c37cb
    • Emese Revfy's avatar
      Add Cyclomatic complexity GCC plugin · 0dae776c
      Emese Revfy authored
      Add a very simple plugin to demonstrate the GCC plugin infrastructure. This GCC
      plugin computes the cyclomatic complexity of each function.
      
      The complexity M of a function's control flow graph is defined as:
      M = E - N + 2P
      where
      E = the number of edges
      N = the number of nodes
      P = the number of connected components (exit nodes).
      Signed-off-by: default avatarEmese Revfy <re.emese@gmail.com>
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarMichal Marek <mmarek@suse.com>
      0dae776c
    • Emese Revfy's avatar
      GCC plugin infrastructure · 6b90bd4b
      Emese Revfy authored
      This patch allows to build the whole kernel with GCC plugins. It was ported from
      grsecurity/PaX. The infrastructure supports building out-of-tree modules and
      building in a separate directory. Cross-compilation is supported too.
      Currently the x86, arm, arm64 and uml architectures enable plugins.
      
      The directory of the gcc plugins is scripts/gcc-plugins. You can use a file or a directory
      there. The plugins compile with these options:
       * -fno-rtti: gcc is compiled with this option so the plugins must use it too
       * -fno-exceptions: this is inherited from gcc too
       * -fasynchronous-unwind-tables: this is inherited from gcc too
       * -ggdb: it is useful for debugging a plugin (better backtrace on internal
          errors)
       * -Wno-narrowing: to suppress warnings from gcc headers (ipa-utils.h)
       * -Wno-unused-variable: to suppress warnings from gcc headers (gcc_version
          variable, plugin-version.h)
      
      The infrastructure introduces a new Makefile target called gcc-plugins. It
      supports all gcc versions from 4.5 to 6.0. The scripts/gcc-plugin.sh script
      chooses the proper host compiler (gcc-4.7 can be built by either gcc or g++).
      This script also checks the availability of the included headers in
      scripts/gcc-plugins/gcc-common.h.
      
      The gcc-common.h header contains frequently included headers for GCC plugins
      and it has a compatibility layer for the supported gcc versions.
      
      The gcc-generate-*-pass.h headers automatically generate the registration
      structures for GIMPLE, SIMPLE_IPA, IPA and RTL passes.
      
      Note that 'make clean' keeps the *.so files (only the distclean or mrproper
      targets clean all) because they are needed for out-of-tree modules.
      
      Based on work created by the PaX Team.
      Signed-off-by: default avatarEmese Revfy <re.emese@gmail.com>
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarMichal Marek <mmarek@suse.com>
      6b90bd4b
  27. 28 May, 2016 1 commit
    • George Spelvin's avatar
      <linux/hash.h>: Add support for architecture-specific functions · 468a9428
      George Spelvin authored
      This is just the infrastructure; there are no users yet.
      
      This is modelled on CONFIG_ARCH_RANDOM; a CONFIG_ symbol declares
      the existence of <asm/hash.h>.
      
      That file may define its own versions of various functions, and define
      HAVE_* symbols (no CONFIG_ prefix!) to suppress the generic ones.
      
      Included is a self-test (in lib/test_hash.c) that verifies the basics.
      It is NOT in general required that the arch-specific functions compute
      the same thing as the generic, but if a HAVE_* symbol is defined with
      the value 1, then equality is tested.
      Signed-off-by: default avatarGeorge Spelvin <linux@sciencehorizons.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Andreas Schwab <schwab@linux-m68k.org>
      Cc: Philippe De Muyter <phdm@macq.eu>
      Cc: linux-m68k@lists.linux-m68k.org
      Cc: Alistair Francis <alistai@xilinx.com>
      Cc: Michal Simek <michal.simek@xilinx.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: uclinux-h8-devel@lists.sourceforge.jp
      468a9428
  28. 21 May, 2016 3 commits
    • Zhaoxiu Zeng's avatar
      lib/GCD.c: use binary GCD algorithm instead of Euclidean · fff7fb0b
      Zhaoxiu Zeng authored
      The binary GCD algorithm is based on the following facts:
      	1. If a and b are all evens, then gcd(a,b) = 2 * gcd(a/2, b/2)
      	2. If a is even and b is odd, then gcd(a,b) = gcd(a/2, b)
      	3. If a and b are all odds, then gcd(a,b) = gcd((a-b)/2, b) = gcd((a+b)/2, b)
      
      Even on x86 machines with reasonable division hardware, the binary
      algorithm runs about 25% faster (80% the execution time) than the
      division-based Euclidian algorithm.
      
      On platforms like Alpha and ARMv6 where division is a function call to
      emulation code, it's even more significant.
      
      There are two variants of the code here, depending on whether a fast
      __ffs (find least significant set bit) instruction is available.  This
      allows the unpredictable branches in the bit-at-a-time shifting loop to
      be eliminated.
      
      If fast __ffs is not available, the "even/odd" GCD variant is used.
      
      I use the following code to benchmark:
      
      	#include <stdio.h>
      	#include <stdlib.h>
      	#include <stdint.h>
      	#include <string.h>
      	#include <time.h>
      	#include <unistd.h>
      
      	#define swap(a, b) \
      		do { \
      			a ^= b; \
      			b ^= a; \
      			a ^= b; \
      		} while (0)
      
      	unsigned long gcd0(unsigned long a, unsigned long b)
      	{
      		unsigned long r;
      
      		if (a < b) {
      			swap(a, b);
      		}
      
      		if (b == 0)
      			return a;
      
      		while ((r = a % b) != 0) {
      			a = b;
      			b = r;
      		}
      
      		return b;
      	}
      
      	unsigned long gcd1(unsigned long a, unsigned long b)
      	{
      		unsigned long r = a | b;
      
      		if (!a || !b)
      			return r;
      
      		b >>= __builtin_ctzl(b);
      
      		for (;;) {
      			a >>= __builtin_ctzl(a);
      			if (a == b)
      				return a << __builtin_ctzl(r);
      
      			if (a < b)
      				swap(a, b);
      			a -= b;
      		}
      	}
      
      	unsigned long gcd2(unsigned long a, unsigned long b)
      	{
      		unsigned long r = a | b;
      
      		if (!a || !b)
      			return r;
      
      		r &= -r;
      
      		while (!(b & r))
      			b >>= 1;
      
      		for (;;) {
      			while (!(a & r))
      				a >>= 1;
      			if (a == b)
      				return a;
      
      			if (a < b)
      				swap(a, b);
      			a -= b;
      			a >>= 1;
      			if (a & r)
      				a += b;
      			a >>= 1;
      		}
      	}
      
      	unsigned long gcd3(unsigned long a, unsigned long b)
      	{
      		unsigned long r = a | b;
      
      		if (!a || !b)
      			return r;
      
      		b >>= __builtin_ctzl(b);
      		if (b == 1)
      			return r & -r;
      
      		for (;;) {
      			a >>= __builtin_ctzl(a);
      			if (a == 1)
      				return r & -r;
      			if (a == b)
      				return a << __builtin_ctzl(r);
      
      			if (a < b)
      				swap(a, b);
      			a -= b;
      		}
      	}
      
      	unsigned long gcd4(unsigned long a, unsigned long b)
      	{
      		unsigned long r = a | b;
      
      		if (!a || !b)
      			return r;
      
      		r &= -r;
      
      		while (!(b & r))
      			b >>= 1;
      		if (b == r)
      			return r;
      
      		for (;;) {
      			while (!(a & r))
      				a >>= 1;
      			if (a == r)
      				return r;
      			if (a == b)
      				return a;
      
      			if (a < b)
      				swap(a, b);
      			a -= b;
      			a >>= 1;
      			if (a & r)
      				a += b;
      			a >>= 1;
      		}
      	}
      
      	static unsigned long (*gcd_func[])(unsigned long a, unsigned long b) = {
      		gcd0, gcd1, gcd2, gcd3, gcd4,
      	};
      
      	#define TEST_ENTRIES (sizeof(gcd_func) / sizeof(gcd_func[0]))
      
      	#if defined(__x86_64__)
      
      	#define rdtscll(val) do { \
      		unsigned long __a,__d; \
      		__asm__ __volatile__("rdtsc" : "=a" (__a), "=d" (__d)); \
      		(val) = ((unsigned long long)__a) | (((unsigned long long)__d)<<32); \
      	} while(0)
      
      	static unsigned long long benchmark_gcd_func(unsigned long (*gcd)(unsigned long, unsigned long),
      								unsigned long a, unsigned long b, unsigned long *res)
      	{
      		unsigned long long start, end;
      		unsigned long long ret;
      		unsigned long gcd_res;
      
      		rdtscll(start);
      		gcd_res = gcd(a, b);
      		rdtscll(end);
      
      		if (end >= start)
      			ret = end - start;
      		else
      			ret = ~0ULL - start + 1 + end;
      
      		*res = gcd_res;
      		return ret;
      	}
      
      	#else
      
      	static inline struct timespec read_time(void)
      	{
      		struct timespec time;
      		clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &time);
      		return time;
      	}
      
      	static inline unsigned long long diff_time(struct timespec start, struct timespec end)
      	{
      		struct timespec temp;
      
      		if ((end.tv_nsec - start.tv_nsec) < 0) {
      			temp.tv_sec = end.tv_sec - start.tv_sec - 1;
      			temp.tv_nsec = 1000000000ULL + end.tv_nsec - start.tv_nsec;
      		} else {
      			temp.tv_sec = end.tv_sec - start.tv_sec;
      			temp.tv_nsec = end.tv_nsec - start.tv_nsec;
      		}
      
      		return temp.tv_sec * 1000000000ULL + temp.tv_nsec;
      	}
      
      	static unsigned long long benchmark_gcd_func(unsigned long (*gcd)(unsigned long, unsigned long),
      								unsigned long a, unsigned long b, unsigned long *res)
      	{
      		struct timespec start, end;
      		unsigned long gcd_res;
      
      		start = read_time();
      		gcd_res = gcd(a, b);
      		end = read_time();
      
      		*res = gcd_res;
      		return diff_time(start, end);
      	}
      
      	#endif
      
      	static inline unsigned long get_rand()
      	{
      		if (sizeof(long) == 8)
      			return (unsigned long)rand() << 32 | rand();
      		else
      			return rand();
      	}
      
      	int main(int argc, char **argv)
      	{
      		unsigned int seed = time(0);
      		int loops = 100;
      		int repeats = 1000;
      		unsigned long (*res)[TEST_ENTRIES];
      		unsigned long long elapsed[TEST_ENTRIES];
      		int i, j, k;
      
      		for (;;) {
      			int opt = getopt(argc, argv, "n:r:s:");
      			/* End condition always first */
      			if (opt == -1)
      				break;
      
      			switch (opt) {
      			case 'n':
      				loops = atoi(optarg);
      				break;
      			case 'r':
      				repeats = atoi(optarg);
      				break;
      			case 's':
      				seed = strtoul(optarg, NULL, 10);
      				break;
      			default:
      				/* You won't actually get here. */
      				break;
      			}
      		}
      
      		res = malloc(sizeof(unsigned long) * TEST_ENTRIES * loops);
      		memset(elapsed, 0, sizeof(elapsed));
      
      		srand(seed);
      		for (j = 0; j < loops; j++) {
      			unsigned long a = get_rand();
      			/* Do we have args? */
      			unsigned long b = argc > optind ? strtoul(argv[optind], NULL, 10) : get_rand();
      			unsigned long long min_elapsed[TEST_ENTRIES];
      			for (k = 0; k < repeats; k++) {
      				for (i = 0; i < TEST_ENTRIES; i++) {
      					unsigned long long tmp = benchmark_gcd_func(gcd_func[i], a, b, &res[j][i]);
      					if (k == 0 || min_elapsed[i] > tmp)
      						min_elapsed[i] = tmp;
      				}
      			}
      			for (i = 0; i < TEST_ENTRIES; i++)
      				elapsed[i] += min_elapsed[i];
      		}
      
      		for (i = 0; i < TEST_ENTRIES; i++)
      			printf("gcd%d: elapsed %llu\n", i, elapsed[i]);
      
      		k = 0;
      		srand(seed);
      		for (j = 0; j < loops; j++) {
      			unsigned long a = get_rand();
      			unsigned long b = argc > optind ? strtoul(argv[optind], NULL, 10) : get_rand();
      			for (i = 1; i < TEST_ENTRIES; i++) {
      				if (res[j][i] != res[j][0])
      					break;
      			}
      			if (i < TEST_ENTRIES) {
      				if (k == 0) {
      					k = 1;
      					fprintf(stderr, "Error:\n");
      				}
      				fprintf(stderr, "gcd(%lu, %lu): ", a, b);
      				for (i = 0; i < TEST_ENTRIES; i++)
      					fprintf(stderr, "%ld%s", res[j][i], i < TEST_ENTRIES - 1 ? ", " : "\n");
      			}
      		}
      
      		if (k == 0)
      			fprintf(stderr, "PASS\n");
      
      		free(res);
      
      		return 0;
      	}
      
      Compiled with "-O2", on "VirtualBox 4.4.0-22-generic #38-Ubuntu x86_64" got:
      
        zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
        gcd0: elapsed 10174
        gcd1: elapsed 2120
        gcd2: elapsed 2902
        gcd3: elapsed 2039
        gcd4: elapsed 2812
        PASS
        zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
        gcd0: elapsed 9309
        gcd1: elapsed 2280
        gcd2: elapsed 2822
        gcd3: elapsed 2217
        gcd4: elapsed 2710
        PASS
        zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
        gcd0: elapsed 9589
        gcd1: elapsed 2098
        gcd2: elapsed 2815
        gcd3: elapsed 2030
        gcd4: elapsed 2718
        PASS
        zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
        gcd0: elapsed 9914
        gcd1: elapsed 2309
        gcd2: elapsed 2779
        gcd3: elapsed 2228
        gcd4: elapsed 2709
        PASS
      
      [akpm@linux-foundation.org: avoid #defining a CONFIG_ variable]
      Signed-off-by: default avatarZhaoxiu Zeng <zhaoxiu.zeng@gmail.com>
      Signed-off-by: default avatarGeorge Spelvin <linux@horizon.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      fff7fb0b
    • Petr Mladek's avatar
      printk/nmi: generic solution for safe printk in NMI · 42a0bb3f
      Petr Mladek authored
      printk() takes some locks and could not be used a safe way in NMI
      context.
      
      The chance of a deadlock is real especially when printing stacks from
      all CPUs.  This particular problem has been addressed on x86 by the
      commit a9edc880 ("x86/nmi: Perform a safe NMI stack trace on all
      CPUs").
      
      The patchset brings two big advantages.  First, it makes the NMI
      backtraces safe on all architectures for free.  Second, it makes all NMI
      messages almost safe on all architectures (the temporary buffer is
      limited.  We still should keep the number of messages in NMI context at
      minimum).
      
      Note that there already are several messages printed in NMI context:
      WARN_ON(in_nmi()), BUG_ON(in_nmi()), anything being printed out from MCE
      handlers.  These are not easy to avoid.
      
      This patch reuses most of the code and makes it generic.  It is useful
      for all messages and architectures that support NMI.
      
      The alternative printk_func is set when entering and is reseted when
      leaving NMI context.  It queues IRQ work to copy the messages into the
      main ring buffer in a safe context.
      
      __printk_nmi_flush() copies all available messages and reset the buffer.
      Then we could use a simple cmpxchg operations to get synchronized with
      writers.  There is also used a spinlock to get synchronized with other
      flushers.
      
      We do not longer use seq_buf because it depends on external lock.  It
      would be hard to make all supported operations safe for a lockless use.
      It would be confusing and error prone to make only some operations safe.
      
      The code is put into separate printk/nmi.c as suggested by Steven
      Rostedt.  It needs a per-CPU buffer and is compiled only on
      architectures that call nmi_enter().  This is achieved by the new
      HAVE_NMI Kconfig flag.
      
      The are MN10300 and Xtensa architectures.  We need to clean up NMI
      handling there first.  Let's do it separately.
      
      The patch is heavily based on the draft from Peter Zijlstra, see
      
        https://lkml.org/lkml/2015/6/10/327
      
      [arnd@arndb.de: printk-nmi: use %zu format string for size_t]
      [akpm@linux-foundation.org: min_t->min - all types are size_t here]
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      Suggested-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Suggested-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Cc: Jan Kara <jack@suse.cz>
      Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>	[arm part]
      Cc: Daniel Thompson <daniel.thompson@linaro.org>
      Cc: Jiri Kosina <jkosina@suse.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Daniel Thompson <daniel.thompson@linaro.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      42a0bb3f
    • Jiri Slaby's avatar
      exit_thread: remove empty bodies · 5f56a5df
      Jiri Slaby authored
      Define HAVE_EXIT_THREAD for archs which want to do something in
      exit_thread. For others, let's define exit_thread as an empty inline.
      
      This is a cleanup before we change the prototype of exit_thread to
      accept a task parameter.
      
      [akpm@linux-foundation.org: fix mips]
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chen Liqin <liqin.linux@gmail.com>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      5f56a5df
  29. 29 Feb, 2016 1 commit
  30. 21 Jan, 2016 2 commits
    • Christoph Hellwig's avatar
      dma-mapping: always provide the dma_map_ops based implementation · e1c7e324
      Christoph Hellwig authored
      Move the generic implementation to <linux/dma-mapping.h> now that all
      architectures support it and remove the HAVE_DMA_ATTR Kconfig symbol now
      that everyone supports them.
      
      [valentinrothberg@gmail.com: remove leftovers in Kconfig]
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Helge Deller <deller@gmx.de>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Signed-off-by: default avatarValentin Rothberg <valentinrothberg@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e1c7e324
    • Christoph Hellwig's avatar
      dma-mapping: make the generic coherent dma mmap implementation optional · 0d4a619b
      Christoph Hellwig authored
      This series converts all remaining architectures to use dma_map_ops and
      the generic implementation of the DMA API.  This not only simplifies the
      code a lot, but also prepares for possible future changes like more
      generic non-iommu dma_ops implementations or generic per-device
      dma_map_ops.
      
      This patch (of 16):
      
      We have a couple architectures that do not want to support this code, so
      add another Kconfig symbol that disables the code similar to what we do
      for the nommu case.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      0d4a619b
  31. 15 Jan, 2016 1 commit
    • Daniel Cashman's avatar
      mm: mmap: add new /proc tunable for mmap_base ASLR · d07e2259
      Daniel Cashman authored
      Address Space Layout Randomization (ASLR) provides a barrier to
      exploitation of user-space processes in the presence of security
      vulnerabilities by making it more difficult to find desired code/data
      which could help an attack.  This is done by adding a random offset to
      the location of regions in the process address space, with a greater
      range of potential offset values corresponding to better protection/a
      larger search-space for brute force, but also to greater potential for
      fragmentation.
      
      The offset added to the mmap_base address, which provides the basis for
      the majority of the mappings for a process, is set once on process exec
      in arch_pick_mmap_layout() and is done via hard-coded per-arch values,
      which reflect, hopefully, the best compromise for all systems.  The
      trade-off between increased entropy in the offset value generation and
      the corresponding increased variability in address space fragmentation
      is not absolute, however, and some platforms may tolerate higher amounts
      of entropy.  This patch introduces both new Kconfig values and a sysctl
      interface which may be used to change the amount of entropy used for
      offset generation on a system.
      
      The direct motivation for this change was in response to the
      libstagefright vulnerabilities that affected Android, specifically to
      information provided by Google's project zero at:
      
        http://googleprojectzero.blogspot.com/2015/09/stagefrightened.html
      
      The attack presented therein, by Google's project zero, specifically
      targeted the limited randomness used to generate the offset added to the
      mmap_base address in order to craft a brute-force-based attack.
      Concretely, the attack was against the mediaserver process, which was
      limited to respawning every 5 seconds, on an arm device.  The hard-coded
      8 bits used resulted in an average expected success rate of defeating
      the mmap ASLR after just over 10 minutes (128 tries at 5 seconds a
      piece).  With this patch, and an accompanying increase in the entropy
      value to 16 bits, the same attack would take an average expected time of
      over 45 hours (32768 tries), which makes it both less feasible and more
      likely to be noticed.
      
      The introduced Kconfig and sysctl options are limited by per-arch
      minimum and maximum values, the minimum of which was chosen to match the
      current hard-coded value and the maximum of which was chosen so as to
      give the greatest flexibility without generating an invalid mmap_base
      address, generally a 3-4 bits less than the number of bits in the
      user-space accessible virtual address space.
      
      When decided whether or not to change the default value, a system
      developer should consider that mmap_base address could be placed
      anywhere up to 2^(value) bits away from the non-randomized location,
      which would introduce variable-sized areas above and below the mmap_base
      address such that the maximum vm_area_struct size may be reduced,
      preventing very large allocations.
      
      This patch (of 4):
      
      ASLR only uses as few as 8 bits to generate the random offset for the
      mmap base address on 32 bit architectures.  This value was chosen to
      prevent a poorly chosen value from dividing the address space in such a
      way as to prevent large allocations.  This may not be an issue on all
      platforms.  Allow the specification of a minimum number of bits so that
      platforms desiring greater ASLR protection may determine where to place
      the trade-off.
      Signed-off-by: default avatarDaniel Cashman <dcashman@google.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Mark Salyzyn <salyzyn@android.com>
      Cc: Jeff Vander Stoep <jeffv@google.com>
      Cc: Nick Kralevich <nnk@google.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Hector Marco-Gisbert <hecmargi@upv.es>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d07e2259