1. 10 Nov, 2018 1 commit
    • Wenwen Wang's avatar
      net: socket: fix a missing-check bug · f57ef24f
      Wenwen Wang authored
      [ Upstream commit b6168562c8ce2bd5a30e213021650422e08764dc ]
      
      In ethtool_ioctl(), the ioctl command 'ethcmd' is checked through a switch
      statement to see whether it is necessary to pre-process the ethtool
      structure, because, as mentioned in the comment, the structure
      ethtool_rxnfc is defined with padding. If yes, a user-space buffer 'rxnfc'
      is allocated through compat_alloc_user_space(). One thing to note here is
      that, if 'ethcmd' is ETHTOOL_GRXCLSRLALL, the size of the buffer 'rxnfc' is
      partially determined by 'rule_cnt', which is actually acquired from the
      user-space buffer 'compat_rxnfc', i.e., 'compat_rxnfc->rule_cnt', through
      get_user(). After 'rxnfc' is allocated, the data in the original user-space
      buffer 'compat_rxnfc' is then copied to 'rxnfc' through copy_in_user(),
      including the 'rule_cnt' field. However, after this copy, no check is
      re-enforced on 'rxnfc->rule_cnt'. So it is possible that a malicious user
      race to change the value in the 'compat_rxnfc->rule_cnt' between these two
      copies. Through this way, the attacker can bypass the previous check on
      'rule_cnt' and inject malicious data. This can cause undefined behavior of
      the kernel and introduce potential security risk.
      
      This patch avoids the above issue via copying the value acquired by
      get_user() to 'rxnfc->rule_cn', if 'ethcmd' is ETHTOOL_GRXCLSRLALL.
      Signed-off-by: default avatarWenwen Wang <wang6495@umn.edu>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f57ef24f
  2. 06 Aug, 2018 1 commit
  3. 31 Jan, 2018 1 commit
    • Alexei Starovoitov's avatar
      bpf: introduce BPF_JIT_ALWAYS_ON config · a3d6dd6a
      Alexei Starovoitov authored
      [ upstream commit 290af866 ]
      
      The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715.
      
      A quote from goolge project zero blog:
      "At this point, it would normally be necessary to locate gadgets in
      the host kernel code that can be used to actually leak data by reading
      from an attacker-controlled location, shifting and masking the result
      appropriately and then using the result of that as offset to an
      attacker-controlled address for a load. But piecing gadgets together
      and figuring out which ones work in a speculation context seems annoying.
      So instead, we decided to use the eBPF interpreter, which is built into
      the host kernel - while there is no legitimate way to invoke it from inside
      a VM, the presence of the code in the host kernel's text section is sufficient
      to make it usable for the attack, just like with ordinary ROP gadgets."
      
      To make attacker job harder introduce BPF_JIT_ALWAYS_ON config
      option that removes interpreter from the kernel in favor of JIT-only mode.
      So far eBPF JIT is supported by:
      x64, arm64, arm32, sparc64, s390, powerpc64, mips64
      
      The start of JITed program is randomized and code page is marked as read-only.
      In addition "constant blinding" can be turned on with net.core.bpf_jit_harden
      
      v2->v3:
      - move __bpf_prog_ret0 under ifdef (Daniel)
      
      v1->v2:
      - fix init order, test_bpf and cBPF (Daniel's feedback)
      - fix offloaded bpf (Jakub's feedback)
      - add 'return 0' dummy in case something can invoke prog->bpf_func
      - retarget bpf tree. For bpf-next the patch would need one extra hunk.
        It will be sent when the trees are merged back to net-next
      
      Considered doing:
        int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT;
      but it seems better to land the patch as-is and in bpf-next remove
      bpf_jit_enable global variable from all JITs, consolidate in one place
      and remove this jit_init() function.
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a3d6dd6a
  4. 20 Dec, 2017 1 commit
  5. 26 Feb, 2017 1 commit
  6. 17 Nov, 2016 1 commit
    • Andreas Gruenbacher's avatar
      xattr: Fix setting security xattrs on sockfs · 4a590153
      Andreas Gruenbacher authored
      The IOP_XATTR flag is set on sockfs because sockfs supports getting the
      "system.sockprotoname" xattr.  Since commit 6c6ef9f2, this flag is checked for
      setxattr support as well.  This is wrong on sockfs because security xattr
      support there is supposed to be provided by security_inode_setsecurity.  The
      smack security module relies on socket labels (xattrs).
      
      Fix this by adding a security xattr handler on sockfs that returns
      -EAGAIN, and by checking for -EAGAIN in setxattr.
      
      We cannot simply check for -EOPNOTSUPP in setxattr because there are
      filesystems that neither have direct security xattr support nor support
      via security_inode_setsecurity.  A more proper fix might be to move the
      call to security_inode_setsecurity into sockfs, but it's not clear to me
      if that is safe: we would end up calling security_inode_post_setxattr after
      that as well.
      Signed-off-by: default avatarAndreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      4a590153
  7. 09 Nov, 2016 1 commit
  8. 08 Oct, 2016 1 commit
  9. 07 Oct, 2016 2 commits
  10. 20 May, 2016 1 commit
  11. 28 Apr, 2016 1 commit
  12. 11 Apr, 2016 1 commit
  13. 07 Apr, 2016 1 commit
  14. 04 Apr, 2016 1 commit
    • Soheil Hassas Yeganeh's avatar
      sock: enable timestamping using control messages · c14ac945
      Soheil Hassas Yeganeh authored
      Currently, SOL_TIMESTAMPING can only be enabled using setsockopt.
      This is very costly when users want to sample writes to gather
      tx timestamps.
      
      Add support for enabling SO_TIMESTAMPING via control messages by
      using tsflags added in `struct sockcm_cookie` (added in the previous
      patches in this series) to set the tx_flags of the last skb created in
      a sendmsg. With this patch, the timestamp recording bits in tx_flags
      of the skbuff is overridden if SO_TIMESTAMPING is passed in a cmsg.
      
      Please note that this is only effective for overriding the recording
      timestamps flags. Users should enable timestamp reporting (e.g.,
      SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_OPT_ID) using
      socket options and then should ask for SOF_TIMESTAMPING_TX_*
      using control messages per sendmsg to sample timestamps for each
      write.
      Signed-off-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Acked-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c14ac945
  15. 28 Mar, 2016 1 commit
  16. 14 Mar, 2016 2 commits
  17. 09 Mar, 2016 3 commits
  18. 15 Jan, 2016 1 commit
    • Vladimir Davydov's avatar
      kmemcg: account certain kmem allocations to memcg · 5d097056
      Vladimir Davydov authored
      Mark those kmem allocations that are known to be easily triggered from
      userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
      memcg.  For the list, see below:
      
       - threadinfo
       - task_struct
       - task_delay_info
       - pid
       - cred
       - mm_struct
       - vm_area_struct and vm_region (nommu)
       - anon_vma and anon_vma_chain
       - signal_struct
       - sighand_struct
       - fs_struct
       - files_struct
       - fdtable and fdtable->full_fds_bits
       - dentry and external_name
       - inode for all filesystems. This is the most tedious part, because
         most filesystems overwrite the alloc_inode method.
      
      The list is far from complete, so feel free to add more objects.
      Nevertheless, it should be close to "account everything" approach and
      keep most workloads within bounds.  Malevolent users will be able to
      breach the limit, but this was possible even with the former "account
      everything" approach (simply because it did not account everything in
      fact).
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: default avatarVladimir Davydov <vdavydov@virtuozzo.com>
      Acked-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      5d097056
  19. 11 Jan, 2016 1 commit
  20. 30 Dec, 2015 1 commit
    • Nicolai Stange's avatar
      net, socket, socket_wq: fix missing initialization of flags · 574aab1e
      Nicolai Stange authored
      Commit ceb5d58b ("net: fix sock_wake_async() rcu protection") from
      the current 4.4 release cycle introduced a new flags member in
      struct socket_wq and moved SOCKWQ_ASYNC_NOSPACE and SOCKWQ_ASYNC_WAITDATA
      from struct socket's flags member into that new place.
      
      Unfortunately, the new flags field is never initialized properly, at least
      not for the struct socket_wq instance created in sock_alloc_inode().
      
      One particular issue I encountered because of this is that my GNU Emacs
      failed to draw anything on my desktop -- i.e. what I got is a transparent
      window, including the title bar. Bisection lead to the commit mentioned
      above and further investigation by means of strace told me that Emacs
      is indeed speaking to my Xorg through an O_ASYNC AF_UNIX socket. This is
      reproducible 100% of times and the fact that properly initializing the
      struct socket_wq ->flags fixes the issue leads me to the conclusion that
      somehow SOCKWQ_ASYNC_WAITDATA got set in the uninitialized ->flags,
      preventing my Emacs from receiving any SIGIO's due to data becoming
      available and it got stuck.
      
      Make sock_alloc_inode() set the newly created struct socket_wq's ->flags
      member to zero.
      
      Fixes: ceb5d58b ("net: fix sock_wake_async() rcu protection")
      Signed-off-by: default avatarNicolai Stange <nicstange@gmail.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      574aab1e
  21. 15 Dec, 2015 1 commit
  22. 01 Dec, 2015 2 commits
    • Eric Dumazet's avatar
      net: fix sock_wake_async() rcu protection · ceb5d58b
      Eric Dumazet authored
      Dmitry provided a syzkaller (http://github.com/google/syzkaller)
      triggering a fault in sock_wake_async() when async IO is requested.
      
      Said program stressed af_unix sockets, but the issue is generic
      and should be addressed in core networking stack.
      
      The problem is that by the time sock_wake_async() is called,
      we should not access the @flags field of 'struct socket',
      as the inode containing this socket might be freed without
      further notice, and without RCU grace period.
      
      We already maintain an RCU protected structure, "struct socket_wq"
      so moving SOCKWQ_ASYNC_NOSPACE & SOCKWQ_ASYNC_WAITDATA into it
      is the safe route.
      
      It also reduces number of cache lines needing dirtying, so might
      provide a performance improvement anyway.
      
      In followup patches, we might move remaining flags (SOCK_NOSPACE,
      SOCK_PASSCRED, SOCK_PASSSEC) to save 8 bytes and let 'struct socket'
      being mostly read and let it being shared between cpus.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ceb5d58b
    • Eric Dumazet's avatar
      net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA · 9cd3e072
      Eric Dumazet authored
      This patch is a cleanup to make following patch easier to
      review.
      
      Goal is to move SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA
      from (struct socket)->flags to a (struct socket_wq)->flags
      to benefit from RCU protection in sock_wake_async()
      
      To ease backports, we rename both constants.
      
      Two new helpers, sk_set_bit(int nr, struct sock *sk)
      and sk_clear_bit(int net, struct sock *sk) are added so that
      following patch can change their implementation.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9cd3e072
  23. 29 Sep, 2015 1 commit
  24. 11 May, 2015 2 commits
  25. 15 Apr, 2015 1 commit
  26. 12 Apr, 2015 1 commit
  27. 11 Apr, 2015 2 commits
  28. 09 Apr, 2015 3 commits
  29. 23 Mar, 2015 1 commit
  30. 20 Mar, 2015 1 commit
  31. 13 Mar, 2015 1 commit