    [PATCH] enforce RLIMIT_NOFILE in poll() · 4e6fd33b
    Chris Snook authored
    
    
    POSIX states that poll() shall fail with EINVAL if nfds > OPEN_MAX.  In
    this context, POSIX is referring to sysconf(_SC_OPEN_MAX), which is the value
    of current->signal->rlim[RLIMIT_NOFILE].rlim_cur in the Linux kernel, not
    the compile-time constant which happens to also be named OPEN_MAX.  In the
    current code, an application may poll up to max_fdset file descriptors,
    even if this exceeds RLIMIT_NOFILE.  The current code also breaks
    applications which poll more than max_fdset descriptors, which worked circa
    2.4.18 when the check was against NR_OPEN, which is 1024*1024.  This patch
    enforces the limit precisely as POSIX defines, even if RLIMIT_NOFILE has
    been changed at run time with ulimit -n.
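
    The run-time limit described above can be observed from user space. The
    following is a minimal sketch, not part of the patch: the helper name
    poll_errno_under_limit() is hypothetical, and it assumes a Linux kernel that
    already enforces RLIMIT_NOFILE in poll(). It temporarily sets the soft
    limit, polls nfds ignored entries (fd = -1), and reports the resulting errno.

```c
#include <assert.h>   /* only needed for the assertions in the usage note */
#include <errno.h>
#include <poll.h>
#include <stdlib.h>
#include <sys/resource.h>

/*
 * Hypothetical helper (not from the patch): set the soft RLIMIT_NOFILE to
 * `limit`, call poll() on `nfds` entries whose fd is -1, restore the old
 * limit, and return 0 if poll() succeeded or the errno value if it failed.
 * Returns -1 if the setup itself failed.
 */
static int poll_errno_under_limit(rlim_t limit, nfds_t nfds)
{
    struct rlimit old, tmp;
    struct pollfd *fds;
    int result;

    if (getrlimit(RLIMIT_NOFILE, &old) != 0)
        return -1;
    if (limit > old.rlim_max)   /* cannot raise the soft limit past the hard limit */
        return -1;

    fds = calloc(nfds ? nfds : 1, sizeof(*fds));
    if (fds == NULL)
        return -1;
    for (nfds_t i = 0; i < nfds; i++)
        fds[i].fd = -1;         /* poll() ignores entries with a negative fd */

    tmp = old;
    tmp.rlim_cur = limit;
    if (setrlimit(RLIMIT_NOFILE, &tmp) != 0) {
        free(fds);
        return -1;
    }

    /* A zero timeout makes poll() return immediately. */
    result = (poll(fds, nfds, 0) < 0) ? errno : 0;

    setrlimit(RLIMIT_NOFILE, &old);   /* restore the original soft limit */
    free(fds);
    return result;
}
```

    Under the stated assumption, poll_errno_under_limit(64, 64) is expected to
    return 0, while poll_errno_under_limit(64, 65) is expected to return EINVAL,
    because nfds exceeds the soft limit even though no descriptor in the array
    is valid.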
    
    To elaborate on the rationale for this, there are three cases:
    
    1) RLIMIT_NOFILE is at the default value of 1024
    
    In this (default) case, the patch changes nothing.  Calls with nfds > 1024
    fail with EINVAL both before and after the patch, and calls with nfds <=
    1024 pass the check both before and after the patch, since 1024 is the
    initial value of max_fdset.
    
    2) RLIMIT_NOFILE has been raised above the default
    
    In this case, poll() becomes more permissive, allowing polling up to
    RLIMIT_NOFILE file descriptors even if fewer than 1024 have been opened.
    The patch won't introduce new errors here.  If an application somehow
    depends on poll() failing when it polls with duplicate or invalid file
    descriptors, it's already broken, since this is already allowed below 1024,
    and will also work above 1024 if enough file descriptors have been open at
    some point to cause max_fdset to have been increased above nfds.
    
    3) RLIMIT_NOFILE has been lowered below the default
    
    In this case, the system administrator or the user has gone out of their
    way to protect the system from inefficient (or malicious) applications
    wasting kernel memory.  The current code allows polling up to 1024 file
    descriptors even if RLIMIT_NOFILE is much lower, which is not what the user
    or administrator intended.  Well-written applications which only poll
    valid, unique file descriptors will never notice the difference, because
    they'll hit the limit on open() first.  If an application gets broken
    because of the patch in this case, then it was already poorly/maliciously
    designed, and allowing it to work in the past was a violation of POSIX and
    a DoS risk on low-resource systems.
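
    The three cases can be checked against a small user-space model of the two
    bounds. This is only a sketch of the comparison described above, not the
    kernel code: check_before() and check_after() are hypothetical names, and
    max_fdset is passed in as a plain parameter.

```c
#include <assert.h>   /* only needed for the assertions in the usage note */
#include <errno.h>

/* Model of the pre-patch bound: nfds is compared against max_fdset. */
static int check_before(unsigned int nfds, unsigned int max_fdset)
{
    return (nfds > max_fdset) ? -EINVAL : 0;
}

/* Model of the post-patch bound: nfds is compared against the soft
 * RLIMIT_NOFILE value (rlim_cur). */
static int check_after(unsigned int nfds, unsigned long rlim_nofile)
{
    return (nfds > rlim_nofile) ? -EINVAL : 0;
}
```

    With max_fdset still at its initial value of 1024, case 2 corresponds to
    check_before(2000, 1024) == -EINVAL versus check_after(2000, 4096) == 0,
    and case 3 to check_before(1000, 1024) == 0 versus
    check_after(1000, 256) == -EINVAL. In case 1 both functions agree for
    every nfds.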
    
    With this patch, poll() will permit exactly what POSIX suggests, no more,
    no less, and for any run-time value set with ulimit -n, not just 256 or
    1024.  There are existing apps which poll a large number of file
    descriptors, some of which may be invalid, and if those numbers straddle
    1024, they currently fail with or without the patch in -mm, though they
    worked fine under 2.4.18.
    
    Signed-off-by: Chris Snook <csnook@redhat.com>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>