    fscache: fix race between enablement and dropping of object · 38026d1a
    NeilBrown authored
    [ Upstream commit c5a94f434c82529afda290df3235e4d85873c5b4 ]
    It was observed that a process blocked indefinitely in
    __fscache_read_or_alloc_page(), waiting for FSCACHE_COOKIE_LOOKING_UP
    to be cleared via fscache_wait_for_deferred_lookup().
    At this time, ->backing_objects was empty, which would normally prevent
    __fscache_read_or_alloc_page() from getting to the point of waiting.
    This implies that ->backing_objects was cleared *after*
    __fscache_read_or_alloc_page() was entered.
    When an object is "killed" and then "dropped",
    FSCACHE_COOKIE_LOOKING_UP is cleared in fscache_lookup_failure(), then
    KILL_OBJECT and DROP_OBJECT are "called" and only in DROP_OBJECT is
    ->backing_objects cleared.  This leaves a window where
    something else can set FSCACHE_COOKIE_LOOKING_UP and
    __fscache_read_or_alloc_page() can start waiting, before
    ->backing_objects is cleared.
    There is some uncertainty in this analysis, but it seems to fit the
    observations.  Adding the wake in this patch will be handled correctly
    by __fscache_read_or_alloc_page(), as it checks if ->backing_objects
    is empty again, after waiting.
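The suspected interleaving can be replayed in a small single-threaded C model. This is only a sketch under stated assumptions, not the kernel code: `cookie_model`, `reader_check()`, `drop_object()` and `replay_race()` are hypothetical names invented for illustration, and the real code paths use bit-waits and locking that are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the cookie state discussed above. */
struct cookie_model {
    bool looking_up;      /* models FSCACHE_COOKIE_LOOKING_UP */
    int  backing_objects; /* models the number of entries in ->backing_objects */
};

enum reader_state { READER_NOBUFS, READER_WAITING, READER_PROCEED };

/* One pass of the reader's checks: give up if there is no backing object,
 * wait while a lookup is pending, otherwise carry on with the read. */
enum reader_state reader_check(const struct cookie_model *c)
{
    if (c->backing_objects == 0)
        return READER_NOBUFS;
    if (c->looking_up)
        return READER_WAITING;
    return READER_PROCEED;
}

/* Models the drop step: ->backing_objects is emptied, and (only when
 * wake_on_drop is set) the pending-lookup flag is cleared as well.
 * Returns true if a waiter would have been woken. */
bool drop_object(struct cookie_model *c, bool wake_on_drop)
{
    c->backing_objects = 0;
    if (wake_on_drop && c->looking_up) {
        c->looking_up = false;
        return true;
    }
    return false;
}

/* Replays the suspected interleaving and returns the reader's final state. */
enum reader_state replay_race(bool wake_on_drop)
{
    struct cookie_model c = { .looking_up = false, .backing_objects = 1 };

    /* The flag was cleared on lookup failure; something else sets it again
     * before the object has been dropped. */
    c.looking_up = true;

    /* The reader enters inside that window and starts waiting. */
    enum reader_state s = reader_check(&c);
    assert(s == READER_WAITING);

    /* The object is dropped; with the wake, the reader re-runs its checks. */
    if (drop_object(&c, wake_on_drop))
        s = reader_check(&c);
    return s;
}
```

In this model, `replay_race(false)` leaves the reader in `READER_WAITING`, which corresponds to the observed hang, while `replay_race(true)` ends in `READER_NOBUFS` because the reader re-checks the emptied list after being woken.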
    The customer who reported the hang also reports that it cannot be
    reproduced with this fix.
    The backtrace for the blocked process looked like:
    PID: 29360  TASK: ffff881ff2ac0f80  CPU: 3   COMMAND: "zsh"
     #0 [ffff881ff43efbf8] schedule at ffffffff815e56f1
     #1 [ffff881ff43efc58] bit_wait at ffffffff815e64ed
     #2 [ffff881ff43efc68] __wait_on_bit at ffffffff815e61b8
     #3 [ffff881ff43efca0] out_of_line_wait_on_bit at ffffffff815e625e
     #4 [ffff881ff43efd08] fscache_wait_for_deferred_lookup at ffffffffa04f2e8f [fscache]
     #5 [ffff881ff43efd18] __fscache_read_or_alloc_page at ffffffffa04f2ffe [fscache]
     #6 [ffff881ff43efd58] __nfs_readpage_from_fscache at ffffffffa0679668 [nfs]
     #7 [ffff881ff43efd78] nfs_readpage at ffffffffa067092b [nfs]
     #8 [ffff881ff43efda0] generic_file_read_iter at ffffffff81187a73
     #9 [ffff881ff43efe50] nfs_file_read at ffffffffa066544b [nfs]
    #10 [ffff881ff43efe70] __vfs_read at ffffffff811fc756
    #11 [ffff881ff43efee8] vfs_read at ffffffff811fccfa
    #12 [ffff881ff43eff18] sys_read at ffffffff811fda62
    #13 [ffff881ff43eff50] entry_SYSCALL_64_fastpath at ffffffff815e986e
    Signed-off-by: NeilBrown <neilb@suse.com>
    Signed-off-by: David Howells <dhowells@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
Kconfig
Makefile
cache.c
cookie.c
fsdef.c
histogram.c
internal.h
main.c
netfs.c
object-list.c
object.c
operation.c
page.c
proc.c
stats.c