    coredump: fix race condition between collapse_huge_page() and core dumping · 465ce9a5
    Andrea Arcangeli authored
    commit 59ea6d06cfa9247b586a695c21f94afa7183af74 upstream.
    
    When fixing the race conditions between the coredump and the mmap_sem
    holders outside the context of the process, we focused on
    mmget_not_zero()/get_task_mm() callers in 04f5866e41fb70 ("coredump: fix
    race condition between mmget_not_zero()/get_task_mm() and core
    dumping"), but those aren't the only cases where the mmap_sem can be
    taken outside of the context of the process as Michal Hocko noticed
    while backporting that commit to older -stable kernels.
    
    If mmgrab() is called in the context of the process, but then the
    mm_count reference is transferred outside the context of the process,
    that can also be a problem if the mmap_sem has to be taken for writing
    through that mm_count reference.
    
    khugepaged registration calls mmgrab() in the context of the process,
    but the mmap_sem for writing is taken later in the context of the
    khugepaged kernel thread.
    
    collapse_huge_page() after taking the mmap_sem for writing doesn't
    modify any vma, so it's not obvious that it could cause a problem to the
    coredump, but it happens to modify the pmd in a way that breaks an
    invariant that pmd_trans_huge_lock() relies upon.  collapse_huge_page()
    needs the mmap_sem for writing just to block concurrent page faults that
    call pmd_trans_huge_lock().
    
    Specifically the invariant that "!pmd_trans_huge()" cannot become a
    "pmd_trans_huge()" doesn't hold while collapse_huge_page() runs.
    
    The coredump will call __get_user_pages() without mmap_sem for reading,
    which eventually can invoke a lockless page fault which will need a
    functional pmd_trans_huge_lock().
    
    So collapse_huge_page() needs to use mmget_still_valid() to check it's
    not running concurrently with the coredump...  as long as the coredump
    can invoke page faults without holding the mmap_sem for reading.
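
    The check described above can be modeled in plain userspace C. This is only a sketch under stated assumptions, not the kernel patch itself: `struct mm_model`, the `*_model` functions and the result codes are hypothetical stand-ins, and the locking is reduced to a comment. It assumes the helper reports the mm as no longer valid once a core dump has started, which the model represents with a non-NULL `core_state` pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's mm structure. */
struct mm_model {
	void *core_state;  /* assumption: non-NULL once a core dump has started */
	int pmd_is_huge;   /* models whether the pmd currently maps a huge page */
};

/* Hypothetical result codes for the modeled collapse attempt. */
enum collapse_result_model {
	COLLAPSE_DONE_MODEL,
	COLLAPSE_SKIPPED_MODEL
};

/* Models the validity check: the mm must not be in the middle of a core dump. */
static bool mmget_still_valid_model(const struct mm_model *mm)
{
	return mm->core_state == NULL;
}

/*
 * Models the fixed collapse path. In the kernel the mmap_sem would be held
 * for writing here; the model only shows the order of the checks.
 */
static enum collapse_result_model collapse_huge_page_model(struct mm_model *mm)
{
	assert(mm != NULL);

	/* Bail out before touching the pmd if a core dump is in progress. */
	if (!mmget_still_valid_model(mm))
		return COLLAPSE_SKIPPED_MODEL;

	/* Only now is it safe to turn a non-huge pmd into a huge one. */
	mm->pmd_is_huge = 1;
	return COLLAPSE_DONE_MODEL;
}
```

    The design point is the ordering: the validity check runs before the pmd is modified, so a core dump that faults pages without the mmap_sem never observes a pmd changing from non-huge to huge underneath it.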
    
    This has "Fixes: khugepaged" to facilitate backporting, but in my view
    it's more a bug in the coredump code that will eventually have to be
    rewritten to stop invoking page faults without the mmap_sem for reading.
    So the long term plan is still to drop all mmget_still_valid().
    
    Link: http://lkml.kernel.org/r/20190607161558.32104-1-aarcange@redhat.com
    Fixes: ba76149f ("thp: khugepaged")
    Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
    Reported-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
    Cc: Oleg Nesterov <oleg@redhat.com>
    Cc: Jann Horn <jannh@google.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
    Cc: Mike Kravetz <mike.kravetz@oracle.com>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: Jason Gunthorpe <jgg@mellanox.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Name
kasan
Kconfig
Kconfig.debug
Makefile
backing-dev.c
balloon_compaction.c
bootmem.c
cleancache.c
cma.c
cma.h
cma_debug.c
compaction.c
debug.c
debug_page_ref.c
dmapool.c
early_ioremap.c
fadvise.c
failslab.c
filemap.c
frame_vector.c
frontswap.c
gup.c
gup_benchmark.c
highmem.c
hmm.c
huge_memory.c
hugetlb.c
hugetlb_cgroup.c
hwpoison-inject.c
init-mm.c
internal.h
interval_tree.c
khugepaged.c
kmemleak-test.c
kmemleak.c
ksm.c
list_lru.c
maccess.c
madvise.c
memblock.c
memcontrol.c
memfd.c
memory-failure.c
memory.c
memory_hotplug.c
mempolicy.c
mempool.c
memtest.c
migrate.c
mincore.c
mlock.c
mm_init.c
mmap.c
mmu_context.c
mmu_notifier.c
mmzone.c
mprotect.c
mremap.c
msync.c
nobootmem.c
nommu.c
oom_kill.c
page-writeback.c
page_alloc.c
page_counter.c
page_ext.c
page_idle.c
page_io.c
page_isolation.c
page_owner.c
page_poison.c
page_vma_mapped.c
pagewalk.c
percpu-internal.h
percpu-km.c
percpu-stats.c
percpu-vm.c
percpu.c
pgtable-generic.c
process_vm_access.c
quicklist.c
readahead.c
rmap.c
rodata_test.c
shmem.c
slab.c
slab.h
slab_common.c
slob.c
slub.c
sparse-vmemmap.c
sparse.c
swap.c
swap_cgroup.c
swap_slots.c
swap_state.c
swapfile.c
truncate.c
usercopy.c
userfaultfd.c
util.c
vmacache.c
vmalloc.c
vmpressure.c
vmscan.c
vmstat.c
workingset.c
z3fold.c
zbud.c
zpool.c
zsmalloc.c
zswap.c