1. 25 Oct, 2013 1 commit
  2. 05 Aug, 2013 1 commit
  3. 23 Feb, 2013 1 commit
  4. 09 Oct, 2012 1 commit
  5. 27 Sep, 2012 4 commits
  6. 31 Jul, 2012 1 commit
  7. 30 May, 2012 1 commit
  8. 03 May, 2012 1 commit
  9. 19 Feb, 2012 1 commit
    • David Howells's avatar
      Wrap accesses to the fd_sets in struct fdtable · 1dce27c5
      David Howells authored
      Wrap accesses to the fd_sets in struct fdtable (for recording open files and
      close-on-exec flags) so that we can move away from using fd_sets since we
      abuse the fd_set structs by not allocating the full-sized structure under
      normal circumstances and by non-core code looking at the internals of the
      The first abuse means that use of FD_ZERO() on these fd_sets is not permitted,
      since that cannot be told about their abnormal lengths.
      This introduces six wrapper functions for setting, clearing and testing
      close-on-exec flags and fd-is-open flags:
      	void __set_close_on_exec(int fd, struct fdtable *fdt);
      	void __clear_close_on_exec(int fd, struct fdtable *fdt);
      	bool close_on_exec(int fd, const struct fdtable *fdt);
      	void __set_open_fd(int fd, struct fdtable *fdt);
      	void __clear_open_fd(int fd, struct fdtable *fdt);
      	bool fd_is_open(int fd, const struct fdtable *fdt);
      Note that I've prepended '__' to the names of the set/clear functions because
      they require the caller to hold a lock to use them.
      Note also that I haven't added wrappers for looking behind the scenes at the
      the array.  Possibly that should exist too.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Link: http://lkml.kernel.org/r/20120216174942.23314.1364.stgit@warthog.procyon.org.ukSigned-off-by: default avatarH. Peter Anvin <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
  10. 24 Mar, 2011 1 commit
  11. 15 Mar, 2011 1 commit
    • Al Viro's avatar
      New kind of open files - "location only". · 1abf0c71
      Al Viro authored
      New flag for open(2) - O_PATH.  Semantics:
      	* pathname is resolved, but the file itself is _NOT_ opened
      as far as filesystem is concerned.
      	* almost all operations on the resulting descriptors shall
      fail with -EBADF.  Exceptions are:
      	1) operations on descriptors themselves (i.e.
      		close(), dup(), dup2(), dup3(), fcntl(fd, F_DUPFD),
      		fcntl(fd, F_DUPFD_CLOEXEC, ...), fcntl(fd, F_GETFD),
      		fcntl(fd, F_SETFD, ...))
      	2) fcntl(fd, F_GETFL), for a common non-destructive way to
      		check if descriptor is open
      	3) "dfd" arguments of ...at(2) syscalls, i.e. the starting
      		points of pathname resolution
      	* closing such descriptor does *NOT* affect dnotify or
      posix locks.
      	* permissions are checked as usual along the way to file;
      no permission checks are applied to the file itself.  Of course,
      giving such thing to syscall will result in permission checks (at
      the moment it means checking that starting point of ....at() is
      a directory and caller has exec permissions on it).
      fget() and fget_light() return NULL on such descriptors; use of
      fget_raw() and fget_raw_light() is needed to get them.  That protects
      existing code from dealing with those things.
      There are two things still missing (they come in the next commits):
      one is handling of symlinks (right now we refuse to open them that
      way; see the next commit for semantics related to those) and another
      is descriptor passing via SCM_RIGHTS datagrams.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
  12. 03 Feb, 2011 1 commit
  13. 28 Oct, 2010 1 commit
    • Linus Torvalds's avatar
      fasync: Fix placement of FASYNC flag comment · 55f335a8
      Linus Torvalds authored
      In commit f7347ce4 ("fasync: re-organize fasync entry insertion to
      allow it under a spinlock") Arnd took an earlier patch of mine that had
      the comment about the FASYNC flag above the wrong function.
      When the fasync_add_entry() function was split to introduce the new
      fasync_insert_entry() helper function, the code that actually cares
      about the FASYNC bit moved to that new helper.
      So just move the comment to the right point.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  14. 27 Oct, 2010 1 commit
  15. 10 Sep, 2010 1 commit
  16. 11 Aug, 2010 1 commit
  17. 29 Jun, 2010 1 commit
  18. 04 Jun, 2010 1 commit
  19. 21 May, 2010 1 commit
  20. 21 Apr, 2010 1 commit
    • Eric Dumazet's avatar
      fasync: RCU and fine grained locking · 989a2979
      Eric Dumazet authored
      kill_fasync() uses a central rwlock, candidate for RCU conversion, to
      avoid cache line ping pongs on SMP.
      fasync_remove_entry() and fasync_add_entry() can disable IRQS on a short
      section instead during whole list scan.
      Use a spinlock per fasync_struct to synchronize kill_fasync_rcu() and
      fasync_{remove|add}_entry(). This spinlock is IRQ safe, so sock_fasync()
      doesnt need its own implementation and can use fasync_helper(), to
      reduce code size and complexity.
      We can remove __kill_fasync() direct use in net/socket.c, and rename it
      to kill_fasync_rcu().
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
  21. 06 Mar, 2010 1 commit
  22. 07 Feb, 2010 1 commit
    • Linus Torvalds's avatar
      Fix race in tty_fasync() properly · 80e1e823
      Linus Torvalds authored
      This reverts commit 70362511 ("tty: fix race in tty_fasync") and
      commit b04da8bf ("fnctl: f_modown should call write_lock_irqsave/
      restore") that tried to fix up some of the fallout but was incomplete.
      It turns out that we really cannot hold 'tty->ctrl_lock' over calling
      __f_setown, because not only did that cause problems with interrupt
      disables (which the second commit fixed), it also causes a potential
      ABBA deadlock due to lock ordering.
      Thanks to Tetsuo Handa for following up on the issue, and running
      lockdep to show the problem.  It goes roughly like this:
       - f_getown gets filp->f_owner.lock for reading without interrupts
         disabled, so an interrupt that happens while that lock is held can
         cause a lockdep chain from f_owner.lock -> sighand->siglock.
       - at the same time, the tty->ctrl_lock -> f_owner.lock chain that
         commit 70362511 introduced, together with the pre-existing
         sighand->siglock -> tty->ctrl_lock chain means that we have a lock
         dependency the other way too.
      So instead of extending tty->ctrl_lock over the whole __f_setown() call,
      we now just take a reference to the 'pid' structure while holding the
      lock, and then release it after having done the __f_setown.  That still
      guarantees that 'struct pid' won't go away from under us, which is all
      we really ever needed.
      Reported-and-tested-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Acked-by: default avatarGreg Kroah-Hartman <gregkh@suse.de>
      Acked-by: default avatarAmérico Wang <xiyou.wangcong@gmail.com>
      Cc: stable@kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  23. 27 Jan, 2010 1 commit
  24. 16 Dec, 2009 1 commit
    • Linus Torvalds's avatar
      fasync: split 'fasync_helper()' into separate add/remove functions · 53281b6d
      Linus Torvalds authored
      Yes, the add and remove cases do share the same basic loop and the
      locking, but the compiler can inline and then CSE some of the end result
      anyway.  And splitting it up makes the code way easier to follow,
      and makes it clearer exactly what the semantics are.
      In particular, we must make sure that the FASYNC flag in file->f_flags
      exactly matches the state of "is this file on any fasync list", since
      not only is that flag visible to user space (F_GETFL), but we also use
      that flag to check whether we need to remove any fasync entries on file
      We got that wrong for the case of a mixed use of file locking (which
      tries to remove any fasync entries for file leases) and fasync.
      Splitting the function up also makes it possible to do some future
      optimizations without making the function even messier.  In particular,
      since the FASYNC flag has to match the state of "is this on a list", we
      can do the following future optimizations:
       - on remove, we don't even need to get the locks and traverse the list
         if FASYNC isn't set, since we can know a priori that there is no
         point (this is effectively the same optimization that we already do
         in __fput() wrt removing fasync on file close)
       - on add, we can use the FASYNC flag to decide whether we are changing
         an existing entry or need to allocate a new one.
      but this is just the cleanup + fix for the FASYNC flag.
      Acked-by: default avatarAl Viro <viro@ZenIV.linux.org.uk>
      Tested-by: default avatarTavis Ormandy <taviso@google.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: stable@kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  25. 18 Nov, 2009 1 commit
  26. 24 Sep, 2009 2 commits
    • Peter Zijlstra's avatar
      fcntl: add F_[SG]ETOWN_EX · ba0a6c9f
      Peter Zijlstra authored
      In order to direct the SIGIO signal to a particular thread of a
      multi-threaded application we cannot, like suggested by the manpage, put a
      TID into the regular fcntl(F_SETOWN) call.  It will still be send to the
      whole process of which that thread is part.
      Since people do want to properly direct SIGIO we introduce F_SETOWN_EX.
      The need to direct SIGIO comes from self-monitoring profiling such as with
      perf-counters.  Perf-counters uses SIGIO to notify that new sample data is
      available.  If the signal is delivered to the same task that generated the
      new sample it can augment that data by inspecting the task's user-space
      state right after it returns from the kernel.  This is esp.  convenient
      for interpreted or virtual machine driven environments.
      Both F_SETOWN_EX and F_GETOWN_EX take a pointer to a struct f_owner_ex
      as argument:
      struct f_owner_ex {
      	int   type;
      	pid_t pid;
      Where type is one of F_OWNER_TID, F_OWNER_PID or F_OWNER_GID.
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Reviewed-by: default avatarOleg Nesterov <oleg@redhat.com>
      Tested-by: default avatarstephane eranian <eranian@googlemail.com>
      Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    • Oleg Nesterov's avatar
      signals: send_sigio: use do_send_sig_info() to avoid check_kill_permission() · 06f1631a
      Oleg Nesterov authored
      group_send_sig_info()->check_kill_permission() assumes that current is the
      sender and uses current_cred().
      This is not true in send_sigio_to_task() case.  From the security pov the
      sender is not current, but the task which did fcntl(F_SETOWN), that is why
      we have sigio_perm() which uses the right creds to check.
      Fortunately, send_sigio() always sends either SEND_SIG_PRIV or
      SI_FROMKERNEL() signal, so check_kill_permission() does nothing.  But
      still it would be tidier to avoid this bogus security check and save a
      couple of cycles.
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: stephane eranian <eranian@googlemail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Roland McGrath <roland@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  27. 12 Jul, 2009 1 commit
  28. 16 Jun, 2009 2 commits
  29. 11 May, 2009 1 commit
  30. 30 Mar, 2009 1 commit
  31. 16 Mar, 2009 3 commits
    • Jonathan Corbet's avatar
      Rationalize fasync return values · 60aa4924
      Jonathan Corbet authored
      Most fasync implementations do something like:
           return fasync_helper(...);
      But fasync_helper() will return a positive value at times - a feature used
      in at least one place.  Thus, a number of other drivers do:
           err = fasync_helper(...);
           if (err < 0)
                   return err;
           return 0;
      In the interests of consistency and more concise code, it makes sense to
      map positive return values onto zero where ->fasync() is called.
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: default avatarJonathan Corbet <corbet@lwn.net>
    • Jonathan Corbet's avatar
      Move FASYNC bit handling to f_op->fasync() · 76398425
      Jonathan Corbet authored
      Removing the BKL from FASYNC handling ran into the challenge of keeping the
      setting of the FASYNC bit in filp->f_flags atomic with regard to calls to
      the underlying fasync() function.  Andi Kleen suggested moving the handling
      of that bit into fasync(); this patch does exactly that.  As a result, we
      have a couple of internal API changes: fasync() must now manage the FASYNC
      bit, and it will be called without the BKL held.
      As it happens, every fasync() implementation in the kernel with one
      exception calls fasync_helper().  So, if we make fasync_helper() set the
      FASYNC bit, we can avoid making any changes to the other fasync()
      functions - as long as those functions, themselves, have proper locking.
      Most fasync() implementations do nothing but call fasync_helper() - which
      has its own lock - so they are easily verified as correct.  The BKL had
      already been pushed down into the rest.
      The networking code has its own version of fasync_helper(), so that code
      has been augmented with explicit FASYNC bit handling.
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: David Miller <davem@davemloft.net>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarJonathan Corbet <corbet@lwn.net>
    • Jonathan Corbet's avatar
      Use f_lock to protect f_flags · db1dd4d3
      Jonathan Corbet authored
      Traditionally, changes to struct file->f_flags have been done under BKL
      protection, or with no protection at all.  This patch causes all f_flags
      changes after file open/creation time to be done under protection of
      f_lock.  This allows the removal of some BKL usage and fixes a number of
      longstanding (if microscopic) races.
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: default avatarJonathan Corbet <corbet@lwn.net>
  32. 14 Jan, 2009 1 commit
  33. 05 Dec, 2008 1 commit