Skip to content
  • Tejun Heo's avatar
    block: don't call elevator callbacks for plug merges · 07c2bd37
    Tejun Heo authored
    
    
    Plug merge calls two elevator callbacks outside queue lock -
    elevator_allow_merge_fn() and elevator_bio_merged_fn().  Although
    attempt_plug_merge() suggests that elevator is guaranteed to be there
    through the existing request on the plug list, nothing prevents plug
    merge from calling into dying or initializing elevator.
    
    For regular merges, bypass ensures elvpriv count to reach zero, which
    in turn prevents merges as all !ELVPRIV requests get REQ_SOFTBARRIER
    from forced back insertion.  Plug merge doesn't check ELVPRIV, and, as
    the requests haven't gone through elevator insertion yet, it doesn't
    have SOFTBARRIER set allowing merges on a bypassed queue.
    
    This, for example, leads to the following crash during elevator
    switch.
    
     BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
     IP: [<ffffffff813b34e9>] cfq_allow_merge+0x49/0xa0
     PGD 112cbc067 PUD 115d5c067 PMD 0
     Oops: 0000 [#1] PREEMPT SMP
     CPU 1
     Modules linked in: deadline_iosched
    
     Pid: 819, comm: dd Not tainted 3.3.0-rc2-work+ #76 Bochs Bochs
     RIP: 0010:[<ffffffff813b34e9>]  [<ffffffff813b34e9>] cfq_allow_merge+0x49/0xa0
     RSP: 0018:ffff8801143a38f8  EFLAGS: 00010297
     RAX: 0000000000000000 RBX: ffff88011817ce28 RCX: ffff880116eb6cc0
     RDX: 0000000000000000 RSI: ffff880118056e20 RDI: ffff8801199512f8
     RBP: ffff8801143a3908 R08: 0000000000000000 R09: 0000000000000000
     R10: 0000000000000001 R11: 0000000000000000 R12: ffff880118195708
     R13: ffff880118052aa0 R14: ffff8801143a3d50 R15: ffff880118195708
     FS:  00007f19f82cb700(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000
     CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
     CR2: 0000000000000008 CR3: 0000000112c6a000 CR4: 00000000000006e0
     DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
     DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
     Process dd (pid: 819, threadinfo ffff8801143a2000, task ffff880116eb6cc0)
     Stack:
      ffff88011817ce28 ffff880118195708 ffff8801143a3928 ffffffff81391bba
      ffff88011817ce28 ffff880118195708 ffff8801143a3948 ffffffff81391bf1
      ffff88011817ce28 0000000000000000 ffff8801143a39a8 ffffffff81398e3e
     Call Trace:
      [<ffffffff81391bba>] elv_rq_merge_ok+0x4a/0x60
      [<ffffffff81391bf1>] elv_try_merge+0x21/0x40
      [<ffffffff81398e3e>] blk_queue_bio+0x8e/0x390
      [<ffffffff81396a5a>] generic_make_request+0xca/0x100
      [<ffffffff81396b04>] submit_bio+0x74/0x100
      [<ffffffff811d45c2>] __blockdev_direct_IO+0x1ce2/0x3450
      [<ffffffff811d0dc7>] blkdev_direct_IO+0x57/0x60
      [<ffffffff811460b5>] generic_file_aio_read+0x6d5/0x760
      [<ffffffff811986b2>] do_sync_read+0xe2/0x120
      [<ffffffff81199345>] vfs_read+0xc5/0x180
      [<ffffffff81199501>] sys_read+0x51/0x90
      [<ffffffff81aeac12>] system_call_fastpath+0x16/0x1b
    
    There are multiple ways to fix this including making plug merge check
    ELVPRIV; however,
    
    * Calling into elevator outside queue lock is confusing and
      error-prone.
    
    * Requests on plug list aren't known to the elevator.  They aren't on
      the elevator yet, so there's no elevator specific state to update.
    
    * Given the nature of plug merges - collecting bio's for the same
      purpose from the same issuer - elevator specific restrictions aren't
      applicable.
    
    So, simply don't call into elevator methods from plug merge by moving
    elv_bio_merged() from bio_attempt_*_merge() to blk_queue_bio(), and
    using blk_try_merge() in attempt_plug_merge().
    
    This is based on Jens' patch to skip elevator_allow_merge_fn() from
    plug merge.
    
    Note that this makes per-cgroup merged stats skip plug merging.
    
    Signed-off-by: default avatarTejun Heo <tj@kernel.org>
    LKML-Reference: <4F16F3CA.90904@kernel.dk>
    Original-patch-by: default avatarJens Axboe <axboe@kernel.dk>
    Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
    07c2bd37