    mm: make get_user_pages() interruptible · 4779280d
    Ying Han authored
    
    
    The initial implementation of checking TIF_MEMDIE covers the cases of OOM
    killing.  If the process has been OOM killed, TIF_MEMDIE is set and
    get_user_pages() returns immediately.  This patch includes:
    
    1.  Add the case where the SIGKILL is sent by a user process.  A
       process can keep trying to get_user_pages() unlimited memory even
       after a user process has sent it a SIGKILL (for example, a monitor
       that finds the process exceeding its memory limit and tries to kill
       it).  In the old implementation, the SIGKILL won't be handled until
       get_user_pages() returns.
    
    2.  Change the return value to ERESTARTSYS.  It makes no sense to
       return ENOMEM if get_user_pages() returned because of a SIGKILL
       signal.  The general convention for a system call interrupted by a
       signal is to return ERESTARTSYS, so the new return value is
       consistent with that.
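
    The two changes above can be sketched in user space as follows.  This is
    only an illustrative model, not the kernel code: struct sketch_task,
    sketch_get_user_pages() and the kill_after test hook are hypothetical
    names, and returning the partial page count when some pages were already
    processed is an assumption of this sketch.  SKETCH_ERESTARTSYS mirrors
    the kernel-internal ERESTARTSYS value (512).

```c
#include <assert.h>
#include <stdbool.h>

/* ERESTARTSYS is kernel-internal (512 in include/linux/errno.h) and is
 * not exported to user space, so the sketch defines its own copy. */
#define SKETCH_ERESTARTSYS 512

/* Hypothetical stand-in for the few bits of task state that matter here. */
struct sketch_task {
    bool memdie;          /* models TIF_MEMDIE, set by the OOM killer     */
    bool sigkill_pending; /* models a pending SIGKILL from any sender     */
    long kill_after;      /* test hook: SIGKILL arrives after this many
                             pages have been processed; 0 means never     */
};

/* Model of the page loop: stop as soon as the task is known to be dying.
 * Returns the number of pages processed, or -SKETCH_ERESTARTSYS if the
 * task was already dying before the first page (assumed behaviour). */
long sketch_get_user_pages(struct sketch_task *tsk, long nr_pages)
{
    long i = 0;

    assert(nr_pages >= 0);

    while (i < nr_pages) {
        /* Old check covered only memdie; the patch adds the SIGKILL case. */
        if (tsk->memdie || tsk->sigkill_pending)
            return i ? i : -SKETCH_ERESTARTSYS;

        i++; /* stands in for faulting in and pinning one page */

        if (i == tsk->kill_after)
            tsk->sigkill_pending = true; /* e.g. a monitor kills the task */
    }
    return i;
}
```

    With this model, a task that is killed part-way through a large request
    stops at the next page instead of running to completion, and a task that
    is already dying gets the restart error rather than ENOMEM.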
    
    Lee:
    
    An unfortunate side effect of "make-get_user_pages-interruptible" is that
    it prevents a SIGKILL'd task from munlock-ing pages that it had mlocked,
    resulting in freeing of mlocked pages.  Freeing of mlocked pages, in
    itself, is not so bad.  We just count them now--although I had hoped to
    remove this stat and add PG_MLOCKED to the free pages flags check.
    
    However, consider pages in shared libraries mapped by more than one task
    that a task mlocked--e.g., via mlockall().  If the task that mlocked the
    pages exits via SIGKILL, these pages would be left mlocked and
    unevictable.
    
    Proposed fix:
    
    Add another GUP flag to ignore sigkill when calling get_user_pages from
    munlock()--similar to Kosaki Motohiro's IGNORE_VMA_PERMISSIONS flag for
    the same purpose.  We are not actually allocating memory in this case,
    which "make-get_user_pages-interruptible" intends to avoid.  We're just
    munlocking pages that are already resident and mapped, and we're reusing
    get_user_pages() to access those pages.
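
    A minimal user-space sketch of the proposed flag, under stated
    assumptions: SKETCH_GUP_IGNORE_SIGKILL, struct munlock_task,
    sketch_gup() and sketch_munlock_range() are hypothetical names chosen
    for illustration and do not reproduce the kernel's definitions.  It only
    shows how a per-call flag lets the munlock path walk all pages while
    other callers still bail out on a pending SIGKILL.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_ERESTARTSYS        512  /* mirrors kernel-internal ERESTARTSYS */
#define SKETCH_GUP_IGNORE_SIGKILL 0x1u /* hypothetical flag bit               */

/* Hypothetical stand-in for the task state consulted by the loop. */
struct munlock_task {
    bool sigkill_pending; /* models a pending SIGKILL */
};

/* Page loop that honours a pending SIGKILL unless the caller opts out. */
long sketch_gup(const struct munlock_task *tsk, long nr_pages,
                unsigned int flags)
{
    bool ignore_sigkill = (flags & SKETCH_GUP_IGNORE_SIGKILL) != 0;
    long i;

    assert(nr_pages >= 0);

    for (i = 0; i < nr_pages; i++) {
        if (!ignore_sigkill && tsk->sigkill_pending)
            return i ? i : -SKETCH_ERESTARTSYS;
        /* Here the real code would look up one resident, mapped page. */
    }
    return i;
}

/* Munlock path: no memory is allocated, so it always finishes the walk. */
long sketch_munlock_range(const struct munlock_task *tsk, long nr_pages)
{
    return sketch_gup(tsk, nr_pages, SKETCH_GUP_IGNORE_SIGKILL);
}
```

    In this model a SIGKILL'd task still visits every page on the munlock
    path, so shared pages it had mlocked are not left behind, while an
    ordinary caller with the same pending signal returns early.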
    
    ??  Maybe we should combine IGNORE_VMA_PERMISSIONS and IGNORE_SIGKILL
    into a single flag: GUP_FLAGS_MUNLOCK ???
    
    [Lee.Schermerhorn@hp.com: ignore sigkill in get_user_pages during munlock]
    Signed-off-by: Paul Menage <menage@google.com>
    Signed-off-by: Ying Han <yinghan@google.com>
    Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
    Cc: Nick Piggin <nickpiggin@yahoo.com.au>
    Cc: Hugh Dickins <hugh@veritas.com>
    Cc: Oleg Nesterov <oleg@tv-sign.ru>
    Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
    Cc: Rohit Seth <rohitseth@google.com>
    Cc: David Rientjes <rientjes@google.com>
    Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>