• Jiri Slaby's avatar
    memcg: make it work on sparse non-0-node systems · 7a47d187
    Jiri Slaby authored
    commit 3e8589963773a5c23e2f1fe4bcad0e9a90b7f471 upstream.
    
    We have a single node system with node 0 disabled:
      Scanning NUMA topology in Northbridge 24
      Number of physical nodes 2
      Skipping disabled node 0
      Node 1 MemBase 0000000000000000 Limit 00000000fbff0000
      NODE_DATA(1) allocated [mem 0xfbfda000-0xfbfeffff]
    
    This causes crashes in memcg when system boots:
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      #PF error: [normal kernel read fault]
    ...
      RIP: 0010:list_lru_add+0x94/0x170
    ...
      Call Trace:
       d_lru_add+0x44/0x50
       dput.part.34+0xfc/0x110
       __fput+0x108/0x230
       task_work_run+0x9f/0xc0
       exit_to_usermode_loop+0xf5/0x100
    
    It is reproducible as far as 4.12.  I did not try older kernels.  You have
    to have a new enough systemd, e.g.  241 (the reason is unknown -- was not
    investigated).  Cannot be reproduced with systemd 234.
    
    The system crashes because the size of lru array is never updated in
    memcg_update_all_list_lrus and the reads are past the zero-sized array,
    causing dereferences of random memory.
    
    The root cause are list_lru_memcg_aware checks in the list_lru code.  The
    test in list_lru_memcg_aware is broken: it assumes node 0 is always
    present, but it is not true on some systems as can be seen above.
    
    So fix this by avoiding checks on node 0.  Remember the memcg-awareness by
    a bool flag in struct list_lru.
    
    Link: http://lkml.kernel.org/r/20190522091940.3615-1-jslaby@suse.cz
    Fixes: 60d3fd32 ("list_lru: introduce per-memcg lists")
    Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
    Acked-by: default avatarMichal Hocko <mhocko@suse.com>
    Suggested-by: default avatarVladimir Davydov <vdavydov.dev@gmail.com>
    Acked-by: default avatarVladimir Davydov <vdavydov.dev@gmail.com>
    Reviewed-by: default avatarShakeel Butt <shakeelb@google.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
    7a47d187
Name
Last commit
Last update
..
kasan Loading commit data...
Kconfig Loading commit data...
Kconfig.debug Loading commit data...
Makefile Loading commit data...
backing-dev.c Loading commit data...
balloon_compaction.c Loading commit data...
bootmem.c Loading commit data...
cleancache.c Loading commit data...
cma.c Loading commit data...
cma.h Loading commit data...
cma_debug.c Loading commit data...
compaction.c Loading commit data...
debug-pagealloc.c Loading commit data...
debug.c Loading commit data...
dmapool.c Loading commit data...
early_ioremap.c Loading commit data...
fadvise.c Loading commit data...
failslab.c Loading commit data...
filemap.c Loading commit data...
frame_vector.c Loading commit data...
frontswap.c Loading commit data...
gup.c Loading commit data...
highmem.c Loading commit data...
huge_memory.c Loading commit data...
hugetlb.c Loading commit data...
hugetlb_cgroup.c Loading commit data...
hwpoison-inject.c Loading commit data...
init-mm.c Loading commit data...
internal.h Loading commit data...
interval_tree.c Loading commit data...
kmemcheck.c Loading commit data...
kmemleak-test.c Loading commit data...
kmemleak.c Loading commit data...
ksm.c Loading commit data...
list_lru.c Loading commit data...
maccess.c Loading commit data...
madvise.c Loading commit data...
memblock.c Loading commit data...
memcontrol.c Loading commit data...
memory-failure.c Loading commit data...
memory.c Loading commit data...
memory_hotplug.c Loading commit data...
mempolicy.c Loading commit data...
mempool.c Loading commit data...
memtest.c Loading commit data...
migrate.c Loading commit data...
mincore.c Loading commit data...
mlock.c Loading commit data...
mm_init.c Loading commit data...
mmap.c Loading commit data...
mmu_context.c Loading commit data...
mmu_notifier.c Loading commit data...
mmzone.c Loading commit data...
mprotect.c Loading commit data...
mremap.c Loading commit data...
msync.c Loading commit data...
nobootmem.c Loading commit data...
nommu.c Loading commit data...
oom_kill.c Loading commit data...
page-writeback.c Loading commit data...
page_alloc.c Loading commit data...
page_counter.c Loading commit data...
page_ext.c Loading commit data...
page_idle.c Loading commit data...
page_io.c Loading commit data...
page_isolation.c Loading commit data...
page_owner.c Loading commit data...
pagewalk.c Loading commit data...
percpu-km.c Loading commit data...
percpu-vm.c Loading commit data...
percpu.c Loading commit data...
pgtable-generic.c Loading commit data...
process_vm_access.c Loading commit data...
quicklist.c Loading commit data...
readahead.c Loading commit data...
rmap.c Loading commit data...
shmem.c Loading commit data...
slab.c Loading commit data...
slab.h Loading commit data...
slab_common.c Loading commit data...
slob.c Loading commit data...
slub.c Loading commit data...
sparse-vmemmap.c Loading commit data...
sparse.c Loading commit data...
swap.c Loading commit data...
swap_cgroup.c Loading commit data...
swap_state.c Loading commit data...
swapfile.c Loading commit data...
truncate.c Loading commit data...
userfaultfd.c Loading commit data...
util.c Loading commit data...
vmacache.c Loading commit data...
vmalloc.c Loading commit data...
vmpressure.c Loading commit data...
vmscan.c Loading commit data...
vmstat.c Loading commit data...
workingset.c Loading commit data...
zbud.c Loading commit data...
zpool.c Loading commit data...
zsmalloc.c Loading commit data...
zswap.c Loading commit data...