    mm: introduce memalloc_nofs_{save,restore} API · 7dea19f9
    Michal Hocko authored
    GFP_NOFS context is used for the following 5 reasons currently:
    
     - to prevent deadlocks when a lock held by the allocation context
       would be needed during memory reclaim
    
     - to prevent stack overflows during reclaim because the allocation
       is already performed from a deep context
    
     - to prevent lockups when the allocation context depends indirectly
       on other reclaimers to make forward progress
    
     - just in case because this would be safe from the fs POV
    
     - silence lockdep false positives
    
    Unfortunately, overuse of this allocation context causes problems for
    the MM.  Memory reclaim is much weaker (especially during heavy FS
    metadata workloads), and the OOM killer cannot be invoked because the
    MM layer doesn't have enough information about how much memory is
    freeable by the FS layer.
    
    In many cases it is far from clear why the weaker context is even used
    and so it might be used unnecessarily.  We would like to get rid of
    those as much as possible.  One way to do that is to use the flag in
    scopes rather than isolated cases.  Such a scope is declared when really
    necessary, tracked per task and all the allocation requests from within
    the context will simply inherit the GFP_NOFS semantic.
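
    The scoped save/restore pattern described above can be sketched as a
    small userspace model.  This is not the kernel implementation: the
    file-scope task_flags variable stands in for current->flags, the bit
    value of PF_MEMALLOC_NOFS is illustrative only, and
    nested_scope_demo() is a hypothetical helper added for the example.

```c
#include <assert.h>

/* Illustrative bit value only; the real flag is defined by the kernel. */
#define PF_MEMALLOC_NOFS 0x1u

/* Stand-in for the per-task flags word (current->flags in the kernel). */
static unsigned int task_flags;

/* Enter a NOFS scope and return the previous state of the flag. */
static unsigned int memalloc_nofs_save(void)
{
	unsigned int old = task_flags & PF_MEMALLOC_NOFS;

	task_flags |= PF_MEMALLOC_NOFS;
	return old;
}

/* Leave a NOFS scope by putting back the state returned by save. */
static void memalloc_nofs_restore(unsigned int old)
{
	task_flags = (task_flags & ~PF_MEMALLOC_NOFS) | old;
}

/* Hypothetical demo: nested scopes unwind in order. Returns final flags. */
static unsigned int nested_scope_demo(void)
{
	unsigned int outer = memalloc_nofs_save();	/* outer scope begins */
	unsigned int inner = memalloc_nofs_save();	/* nested scope begins */

	memalloc_nofs_restore(inner);
	/* Still inside the outer scope, so the flag must remain set. */
	assert(task_flags & PF_MEMALLOC_NOFS);

	memalloc_nofs_restore(outer);
	return task_flags;
}
```

    Returning the previous flag state from save, instead of clearing the
    flag unconditionally in restore, is what lets scopes nest: an inner
    restore does not end an outer scope.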
    
    Not only is this easier to understand and maintain because there are
    far fewer problematic contexts than specific allocation requests, this
    also helps code paths where FS layer interacts with other layers (e.g.
    crypto, security modules, MM etc...) and there is no easy way to convey
    the allocation context between the layers.
    
    Introduce memalloc_nofs_{save,restore} API to control the scope of
    GFP_NOFS allocation context.  This is basically copying
    memalloc_noio_{save,restore} API we have for other restricted allocation
    context GFP_NOIO.  The PF_MEMALLOC_NOFS flag already exists and it is
    just an alias for PF_FSTRANS which has been XFS-specific until recently.
    There are no PF_FSTRANS users anymore, so let's just drop it.
    
    PF_MEMALLOC_NOFS is now checked in the MM layer and drops __GFP_FS
    implicitly same as PF_MEMALLOC_NOIO drops __GFP_IO.  memalloc_noio_flags
    is renamed to current_gfp_context because it now cares about both
    PF_MEMALLOC_NOFS and PF_MEMALLOC_NOIO contexts.  XFS code paths preserve
    their semantics.  kmem_flags_convert() doesn't need to evaluate the flag
    anymore.
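
    The implicit masking can be modelled in userspace roughly as follows.
    All MODEL_* names and bit values are hypothetical stand-ins for the
    kernel's __GFP_* and PF_MEMALLOC_* definitions, and the task flags
    are passed as a parameter instead of being read from current.  The
    sketch also assumes that the NOIO context clears the FS bit as well,
    since GFP_NOIO is the stricter of the two contexts.

```c
typedef unsigned int model_gfp_t;

/* Hypothetical bit values standing in for __GFP_IO, __GFP_FS, etc. */
#define MODEL_GFP_IO		0x1u
#define MODEL_GFP_FS		0x2u
#define MODEL_GFP_RECLAIM	0x4u

#define MODEL_GFP_KERNEL (MODEL_GFP_RECLAIM | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS	 (MODEL_GFP_RECLAIM | MODEL_GFP_IO)
#define MODEL_GFP_NOIO	 (MODEL_GFP_RECLAIM)

/* Hypothetical task flag bits standing in for PF_MEMALLOC_{NOIO,NOFS}. */
#define MODEL_PF_MEMALLOC_NOIO	0x1u
#define MODEL_PF_MEMALLOC_NOFS	0x2u

/*
 * Model of current_gfp_context(): narrow the requested gfp mask
 * according to the scope flags set on the calling task.
 */
static model_gfp_t model_current_gfp_context(unsigned int task_flags,
					     model_gfp_t flags)
{
	if (task_flags & MODEL_PF_MEMALLOC_NOIO)
		flags &= ~(MODEL_GFP_IO | MODEL_GFP_FS);	/* assumed: NOIO implies NOFS */
	else if (task_flags & MODEL_PF_MEMALLOC_NOFS)
		flags &= ~MODEL_GFP_FS;
	return flags;
}
```

    With this model, a MODEL_GFP_KERNEL request made inside a NOFS scope
    is treated as MODEL_GFP_NOFS without the caller changing its
    arguments, which is the property that lets nested layers inherit the
    context.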
    
    This patch shouldn't introduce any functional changes.
    
    Let's hope that filesystems will drop direct GFP_NOFS (resp.  ~__GFP_FS)
    usage as much as possible and use only properly documented
    memalloc_nofs_{save,restore} checkpoints where they are appropriate.
    
    [akpm@linux-foundation.org: fix comment typo, reflow comment]
    Link: http://lkml.kernel.org/r/20170306131408.9828-5-mhocko@kernel.org
    
    
    Signed-off-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Dave Chinner <david@fromorbit.com>
    Cc: Theodore Ts'o <tytso@mit.edu>
    Cc: Chris Mason <clm@fb.com>
    Cc: David Sterba <dsterba@suse.cz>
    Cc: Jan Kara <jack@suse.cz>
    Cc: Brian Foster <bfoster@redhat.com>
    Cc: Darrick J. Wong <darrick.wong@oracle.com>
    Cc: Nikolay Borisov <nborisov@suse.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>