    lockdep: fix invalid list_del_rcu in zap_class · 74870172
    Zhu Yi authored
    
    
    The problem was found during iwlagn driver testing on the
    v2.6.27-rc4-176-gb8e6c91 kernel, but it turns out to be a lockdep bug.
    In our testing, we repeatedly loaded and unloaded the iwlagn driver
    (>50 times) until MAX_STACK_TRACE_ENTRIES was reached (expected
    behaviour?). The error message with the call trace is below.
    
    BUG: MAX_STACK_TRACE_ENTRIES too low!
    turning off the locking correctness validator.
    Pid: 4895, comm: iwlagn Not tainted 2.6.27-rc4 #13
    
    Call Trace:
     [<ffffffff81014aa1>] save_stack_trace+0x22/0x3e
     [<ffffffff8105390a>] save_trace+0x8b/0x91
     [<ffffffff81054e60>] mark_lock+0x1b0/0x8fa
     [<ffffffff81056f71>] __lock_acquire+0x5b9/0x716
     [<ffffffffa00d818a>] ieee80211_sta_work+0x0/0x6ea [mac80211]
     [<ffffffff81057120>] lock_acquire+0x52/0x6b
     [<ffffffff81045f0e>] run_workqueue+0x97/0x1ed
     [<ffffffff81045f5e>] run_workqueue+0xe7/0x1ed
     [<ffffffff81045f0e>] run_workqueue+0x97/0x1ed
     [<ffffffff81046ae4>] worker_thread+0xd8/0xe3
     [<ffffffff81049503>] autoremove_wake_function+0x0/0x2e
     [<ffffffff81046a0c>] worker_thread+0x0/0xe3
     [<ffffffff810493ec>] kthread+0x47/0x73
     [<ffffffff8128e3ab>] trace_hardirqs_on_thunk+0x3a/0x3f
     [<ffffffff8100cea9>] child_rip+0xa/0x11
     [<ffffffff8100c4df>] restore_args+0x0/0x30
     [<ffffffff810316e1>] finish_task_switch+0x0/0xcc
     [<ffffffff810493a5>] kthread+0x0/0x73
     [<ffffffff8100ce9f>] child_rip+0x0/0x11
    
    Although the above is harmless, when the iwlagn module is removed
    later, lockdep triggers a kernel oops, as shown below.
    
    BUG: unable to handle kernel NULL pointer dereference at
    0000000000000008
    IP: [<ffffffff810531e1>] zap_class+0x24/0x82
    PGD 73128067 PUD 7448c067 PMD 0
    Oops: 0002 [1] SMP
    CPU 0
    Modules linked in: rfcomm l2cap bluetooth autofs4 sunrpc
    nf_conntrack_ipv6 xt_state nf_conntrack xt_tcpudp ip6t_ipv6header
    ip6t_REJECT ip6table_filter ip6_tables x_tables ipv6 cpufreq_ondemand
    acpi_cpufreq dm_mirror dm_log dm_multipath dm_mod snd_hda_intel sr_mod
    snd_seq_dummy snd_seq_oss snd_seq_midi_event battery snd_seq
    snd_seq_device cdrom button snd_pcm_oss snd_mixer_oss snd_pcm
    snd_timer snd_page_alloc e1000e snd_hwdep sg iTCO_wdt
    iTCO_vendor_support ac pcspkr i2c_i801 i2c_core snd soundcore video
    output ata_piix ata_generic libata sd_mod scsi_mod ext3 jbd mbcache
    uhci_hcd ohci_hcd ehci_hcd [last unloaded: mac80211]
    Pid: 4941, comm: modprobe Not tainted 2.6.27-rc4 #10
    RIP: 0010:[<ffffffff810531e1>]  [<ffffffff810531e1>]
    zap_class+0x24/0x82
    RSP: 0000:ffff88007bcb3eb0  EFLAGS: 00010046
    RAX: 0000000000068ee8 RBX: ffffffff8192a0a0 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000001dfb RDI: ffffffff816e70b0
    RBP: ffffffffa00cd000 R08: ffffffff816818f8 R09: ffff88007c923558
    R10: ffffe20002ad2408 R11: ffffffff811028ec R12: ffffffff8192a0a0
    R13: 000000000002bd90 R14: 0000000000000000 R15: 0000000000000296
    FS:  00007f9d1cee56f0(0000) GS:ffffffff814a58c0(0000)
    knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 0000000000000008 CR3: 0000000073047000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process modprobe (pid: 4941, threadinfo ffff88007bcb2000, task
    ffff8800758d1fc0)
    Stack:  ffffffff81057376 0000000000000000 ffffffffa00f7b00
    0000000000000000
     0000000000000080 0000000000618278 00007fff24f16720 0000000000000000
     ffffffff8105d37a ffffffffa00f7b00 ffffffff8105d591 313132303863616d
    Call Trace:
     [<ffffffff81057376>] ? lockdep_free_key_range+0x61/0xf5
     [<ffffffff8105d37a>] ? free_module+0xd4/0xe4
     [<ffffffff8105d591>] ? sys_delete_module+0x1de/0x1f9
     [<ffffffff8106dbfa>] ? audit_syscall_entry+0x12d/0x160
     [<ffffffff8100be2b>] ? system_call_fastpath+0x16/0x1b
    
    Code: b2 00 01 00 00 00 c3 31 f6 49 c7 c0 10 8a 61 81 eb 32 49 39 38
    75 26 48 98 48 6b c0 38 48 8b 90 08 8a 61 81 48 8b 88 00 8a 61 81 <48>
    89 51 08 48 89 0a 48 c7 80 08 8a 61 81 00 02 20 00 48 ff c6
    RIP  [<ffffffff810531e1>] zap_class+0x24/0x82
     RSP <ffff88007bcb3eb0>
    CR2: 0000000000000008
    ---[ end trace a1297e0c4abb0f2e ]---
    
    The root cause of this oops is in add_lock_to_list(): when
    save_trace() fails because MAX_STACK_TRACE_ENTRIES has been reached,
    entry->class is already assigned, but the entry is never added to any
    lock list. This makes the list_del_rcu() in zap_class() oops later,
    when the module is unloaded. This patch fixes the problem by assigning
    entry->class only after save_trace() succeeds.
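
The ordering described above can be illustrated with a small user-space
model. This is a simplified sketch, not the kernel source: the pool sizes
(MAX_ENTRIES, MAX_TRACES), the list_node helpers, and the reduced
function signatures are assumptions made for illustration. Only the names
add_lock_to_list(), save_trace(), zap_class(), and entry->class come from
the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ENTRIES 8   /* hypothetical size of the static entry pool */
#define MAX_TRACES  2   /* hypothetical trace budget, tiny on purpose */

struct list_node { struct list_node *next, *prev; };
struct lock_class { int id; };

struct lock_list {
    struct list_node entry;
    struct lock_class *class;   /* NULL until the entry is really linked */
};

/* Zero-initialised pool: unlinked entries have NULL next/prev/class. */
static struct lock_list list_entries[MAX_ENTRIES];
static int nr_list_entries;
static int nr_traces;

static void list_init(struct list_node *head)
{
    head->next = head->prev = head;
}

static void list_add_tail(struct list_node *node, struct list_node *head)
{
    node->next = head;
    node->prev = head->prev;
    head->prev->next = node;
    head->prev = node;
}

/* On a never-linked (zeroed) node this would write through NULL. */
static void list_del(struct list_node *node)
{
    assert(node->next != NULL && node->prev != NULL);
    node->next->prev = node->prev;
    node->prev->next = node->next;
}

/* Stand-in for save_trace(): fails once the trace budget is used up. */
static bool save_trace(void)
{
    if (nr_traces >= MAX_TRACES)
        return false;
    nr_traces++;
    return true;
}

/* Fixed ordering: entry->class is assigned only after save_trace() succeeds. */
static bool add_lock_to_list(struct lock_class *this, struct list_node *head)
{
    if (nr_list_entries >= MAX_ENTRIES)
        return false;
    struct lock_list *entry = &list_entries[nr_list_entries++];

    if (!save_trace())
        return false;           /* entry stays unlinked, class stays NULL */

    entry->class = this;
    list_add_tail(&entry->entry, head);
    return true;
}

/* Unlinks every pool entry that refers to the class; returns the count. */
static int zap_class(struct lock_class *class)
{
    int removed = 0;

    for (int i = 0; i < nr_list_entries; i++) {
        if (list_entries[i].class == class) {
            list_del(&list_entries[i].entry);
            removed++;
        }
    }
    return removed;
}
```

In this model, an entry whose save_trace() call failed is still counted in
nr_list_entries, but its class pointer remains NULL, so zap_class() never
matches it and never calls list_del() on an unlinked node. If the
assignment were moved before the save_trace() call, the same scan would
match the unlinked entry and dereference its NULL list pointers.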
    
    Signed-off-by: Zhu Yi <yi.zhu@intel.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>