    mm: memcontrol: use vmalloc fallback for large kmem memcg arrays · f80c7dab
    Johannes Weiner authored
    For quick per-memcg indexing, slab caches and list_lru structures
    maintain linear arrays of descriptors.  As the number of concurrent
    memory cgroups in the system goes up, this requires large contiguous
    allocations (8k cgroups = order-5, 16k cgroups = order-6 etc.) for every
    existing slab cache and list_lru, which can easily fail on loaded
    systems.  E.g.:
    
      mkdir: page allocation failure: order:5, mode:0x14040c0(GFP_KERNEL|__GFP_COMP), nodemask=(null)
      CPU: 1 PID: 6399 Comm: mkdir Not tainted 4.13.0-mm1-00065-g720bbe532b7c-dirty #481
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-20170228_101828-anatol 04/01/2014
      Call Trace:
       ? __alloc_pages_direct_compact+0x4c/0x110
       __alloc_pages_nodemask+0xf50/0x1430
       alloc_pages_current+0x60/0xc0
       kmalloc_order_trace+0x29/0x1b0
       __kmalloc+0x1f4/0x320
       memcg_update_all_list_lrus+0xca/0x2e0
       mem_cgroup_css_alloc+0x612/0x670
       cgroup_apply_control_enable+0x19e/0x360
       cgroup_mkdir+0x322/0x490
       kernfs_iop_mkdir+0x55/0x80
       vfs_mkdir+0xd0/0x120
       SyS_mkdirat+0x6c/0xe0
       SyS_mkdir+0x14/0x20
       entry_SYSCALL_64_fastpath+0x18/0xad
      Mem-Info:
      active_anon:2965 inactive_anon:19 isolated_anon:0
       active_file:100270 inactive_file:98846 isolated_file:0
       unevictable:0 dirty:0 writeback:0 unstable:0
       slab_reclaimable:7328 slab_unreclaimable:16402
       mapped:771 shmem:52 pagetables:278 bounce:0
       free:13718 free_pcp:0 free_cma:0
    
    This output is from an artificial reproducer, but we have repeatedly
    observed order-7 failures in production in the Facebook fleet.  These
    systems become useless as they cannot run more jobs, even though there
    is plenty of memory to allocate 128 individual pages.
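
The order figures quoted above follow from simple arithmetic: an order-n request asks for 2^n physically contiguous pages. The following is a minimal userspace sketch of that calculation, not kernel code. It assumes 4 KiB pages, and the names `SKETCH_PAGE_SIZE`, `sketch_order_for_size` and `sketch_pages_in_order` are hypothetical helpers introduced only for illustration. It does not model how the kernel sizes the per-memcg arrays from the cgroup count.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages (the common default on x86-64). */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Smallest order such that (SKETCH_PAGE_SIZE << order) >= size.
 * This mirrors the idea behind the kernel's get_order(), not its code.
 */
static unsigned int sketch_order_for_size(size_t size)
{
    unsigned int order = 0;
    size_t block = SKETCH_PAGE_SIZE;

    assert(size > 0);
    while (block < size) {
        block <<= 1;   /* each order doubles the contiguous block */
        order++;
    }
    return order;
}

/* Number of contiguous pages in one block of the given order. */
static unsigned long sketch_pages_in_order(unsigned int order)
{
    return 1UL << order;
}
```

Under these assumptions, `sketch_pages_in_order(7)` is 128, which matches the "128 individual pages" mentioned above, and an array of 16384 eight-byte pointers (128 KiB) yields `sketch_order_for_size(16384 * 8) == 5`.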
    
    Use kvmalloc and kvzalloc to fall back to vmalloc space if these arrays
    prove too large to allocate as physically contiguous memory.
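
The fallback idea can be modelled in userspace as "try the contiguous path first, then fall back to a second allocator". The sketch below is an analogy under stated assumptions, not the kernel implementation: `sketch_alloc_contig`, `sketch_alloc_virtual`, `sketch_kvzalloc`, `sketch_kvfree` and the `SKETCH_CONTIG_LIMIT` threshold are all hypothetical, and the contiguous path is made to fail artificially for large sizes. In the kernel, kvmalloc and kvzalloc return a plain pointer, and memory obtained from them has to be released with kvfree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical size above which the "contiguous" path is made to fail. */
#define SKETCH_CONTIG_LIMIT (32 * 1024)

struct sketch_buf {
    void *ptr;
    bool contiguous;   /* records which path satisfied the request */
};

/* Stand-in for a physically contiguous allocator that fails on large sizes. */
static void *sketch_alloc_contig(size_t size)
{
    if (size > SKETCH_CONTIG_LIMIT)
        return NULL;
    return calloc(1, size);
}

/* Stand-in for a virtually contiguous allocator used as the fallback. */
static void *sketch_alloc_virtual(size_t size)
{
    return calloc(1, size);
}

/* Zeroed allocation: contiguous path first, fallback path second. */
static struct sketch_buf sketch_kvzalloc(size_t size)
{
    struct sketch_buf buf = {
        .ptr = sketch_alloc_contig(size),
        .contiguous = true,
    };

    if (!buf.ptr) {
        buf.ptr = sketch_alloc_virtual(size);
        buf.contiguous = false;
    }
    return buf;
}

/* Single release function, whichever path produced the memory. */
static void sketch_kvfree(struct sketch_buf *buf)
{
    assert(buf != NULL);
    free(buf->ptr);
    buf->ptr = NULL;
}
```

With this model, a 1 KiB request is served by the first path, while a 512 KiB request (the size of an order-7 block with 4 KiB pages) only succeeds through the fallback. The caller does not need to know which path was taken.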
    
    Link: http://lkml.kernel.org/r/20170918184919.20644-1-hannes@cmpxchg.org
    Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
    Reviewed-by: Josef Bacik <jbacik@fb.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>