Skip to content
  • Shaohua Li's avatar
    md/raid5: fix a race condition in stripe batch · 3664847d
    Shaohua Li authored
    
    
    We have a race condition in below scenario, say have 3 continuous stripes, sh1,
    sh2 and sh3, sh1 is the stripe_head of sh2 and sh3:
    
    CPU1				CPU2				CPU3
    handle_stripe(sh3)
    				stripe_add_to_batch_list(sh3)
    				-> lock(sh2, sh3)
    				-> lock batch_lock(sh1)
    				-> add sh3 to batch_list of sh1
    				-> unlock batch_lock(sh1)
    								clear_batch_ready(sh1)
    								-> lock(sh1) and batch_lock(sh1)
    								-> clear STRIPE_BATCH_READY for all stripes in batch_list
    								-> unlock(sh1) and batch_lock(sh1)
    ->clear_batch_ready(sh3)
    -->test_and_clear_bit(STRIPE_BATCH_READY, sh3)
    --->return 0 as sh->batch == NULL
    				-> sh3->batch_head = sh1
    				-> unlock (sh2, sh3)
    
    In CPU1, handle_stripe will continue handle sh3 even it's in batch stripe list
    of sh1. By moving sh3->batch_head assignment in to batch_lock, we make it
    impossible to clear STRIPE_BATCH_READY before batch_head is set.
    
    Thanks Stephane for helping debug this tricky issue.
    
    Reported-and-tested-by: default avatarStephane Thiell <sthiell@stanford.edu>
    Cc: stable@vger.kernel.org (v4.1+)
    Signed-off-by: default avatarShaohua Li <shli@fb.com>
    3664847d