    i7core_edac: fix kernel crash on unloading i7core_edac. · 1c069100
    Lans Zhang authored
    
    
    It is easy to trigger this crash on 3.7.0:
    
    root@intel_westmere_ep-3:~# modprobe -r i7core_edac
    EDAC PCI: Removed device 0 for i7core_edac EDAC PCI controller: DEV 0000:fe:03.0
    EDAC MC: Removed device 1 for i7core_edac.c i7 core #1: DEV 0000:fe:03.0
    EDAC PCI: Removed device 1 for i7core_edac EDAC PCI controller: DEV 0000:ff:03.0
    EDAC MC: Removed device 0 for i7core_edac.c i7 core #0: DEV 0000:ff:03.0
    BUG: unable to handle kernel NULL pointer dereference at 0000000000000110
    IP: [<ffffffff82069ee9>] __blocking_notifier_call_chain+0x29/0x80
    PGD 1eaae7067 PUD 1e96e4067 PMD 0
    Oops: 0000 [#1] PREEMPT SMP
    Modules linked in: minix acpi_cpufreq freq_table mperf ioatdma processor edac_core(-) iTCO_wdt coretemp evdev hwmon lpc_ich dca mfd_core crc32c_intel ioapic [last unloaded: i7core_edac]
    CPU 3
    Pid: 1268, comm: modprobe Not tainted 3.7.0-WR5.0.1.0_standard+ #30 Intel Corporation S5520HC/S5520HC
    RIP: 0010:[<ffffffff82069ee9>]  [<ffffffff82069ee9>] __blocking_notifier_call_chain+0x29/0x80
    RSP: 0018:ffff8801eb12de28  EFLAGS: 00010246
    RAX: 0000000000000000 RBX: 00000000000000f0 RCX: 00000000ffffffff
    RDX: ffff88012b452800 RSI: 0000000000000002 RDI: 00000000000000f0
    RBP: ffff8801eb12de68 R08: 0000000000000000 R09: ffffea0004ad1118
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
    R13: ffff8801eb12dee8 R14: ffff88012b452800 R15: 000000000060e518
    FS:  00007f9ea95a9700(0000) GS:ffff8801efc20000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 0000000000000110 CR3: 00000001262f1000 CR4: 00000000000007e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process modprobe (pid: 1268, threadinfo ffff8801eb12c000, task ffff8801e8421690)
    Stack:
      ffff88012c802a00 ffff88012b445ec0 ffff88012c802300 ffff88012b452800
      0000000000000000 ffff8801eb12dee8 000000000060e080 000000000060e518
      ffff8801eb12de78 ffffffff82069f56 ffff8801eb12dea8 ffffffff824ead7c
    Call Trace:
      [<ffffffff82069f56>] blocking_notifier_call_chain+0x16/0x20
      [<ffffffff824ead7c>] device_del+0x3c/0x1d0
      [<ffffffffa00095a8>] edac_mc_sysfs_exit+0x1c/0x2f [edac_core]
      [<ffffffffa000961c>] edac_exit+0x4f/0x56 [edac_core]
      [<ffffffff820a3d2a>] sys_delete_module+0x17a/0x240
      [<ffffffff8212da7c>] ? vm_munmap+0x5c/0x80
      [<ffffffff82877682>] system_call_fastpath+0x16/0x1b
    Code: 90 90 55 48 89 e5 48 83 ec 40 48 89 5d d8 4c 89 65 e0 4c 89 6d e8 4c 89 75 f0 4c 89 7d f8 66 66 66 66 90 31 c0 49 89 d6 48 89 fb <48> 8b 57 20 49 89 f5 41 89 cf 4c 8d 67 20 48 85 d2 74 2c 4c 89
    RIP  [<ffffffff82069ee9>] __blocking_notifier_call_chain+0x29/0x80
      RSP <ffff8801eb12de28>
    CR2: 0000000000000110
    ---[ end trace b69acf12ccad1c0d ]---
    
    Normally the PCI code grabs edac_subsys only once, at initialization.
    However, edac_subsys may be released several times if multiple PCI MCs
    exist, so the releases outnumber the grabs. The fix balances the two
    operations.
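
The imbalance described above can be illustrated with a small user-space model. This is only a sketch, not the kernel code: the struct, the function names, and the "first user grabs, last user releases" policy are assumptions made for illustration, and the actual patch may restore the balance differently.

```c
#include <assert.h>

/* Hypothetical model of a shared subsystem reference; not kernel code. */
struct subsys_model {
    int subsys_refs;  /* references currently held on the subsystem */
    int pci_users;    /* PCI controllers currently registered */
    int underflows;   /* releases that had no matching grab */
};

static void subsys_get(struct subsys_model *m)
{
    m->subsys_refs++;
}

static void subsys_put(struct subsys_model *m)
{
    if (m->subsys_refs == 0) {
        /* In a real kernel this is where state would be corrupted. */
        m->underflows++;
        return;
    }
    m->subsys_refs--;
}

/* Setup: only the first controller grabs the subsystem. */
static void pci_setup(struct subsys_model *m)
{
    if (m->pci_users++ == 0)
        subsys_get(m);
}

/* Unbalanced teardown: every controller releases the subsystem. */
static void pci_teardown_unbalanced(struct subsys_model *m)
{
    m->pci_users--;
    subsys_put(m);
}

/* Balanced teardown: only the last controller releases the subsystem. */
static void pci_teardown_balanced(struct subsys_model *m)
{
    if (--m->pci_users == 0)
        subsys_put(m);
}

/* Register n controllers, remove them all, and report unmatched releases. */
static int run_model(int n, int balanced)
{
    struct subsys_model m = {0, 0, 0};
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++)
        pci_setup(&m);
    for (i = 0; i < n; i++) {
        if (balanced)
            pci_teardown_balanced(&m);
        else
            pci_teardown_unbalanced(&m);
    }
    return m.underflows;
}
```

In this model, `run_model(2, 0)` reports one unmatched release for two controllers (the two-device case in the log above), while `run_model(2, 1)` reports none.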
    
    Signed-off-by: Lans Zhang <jia.zhang@windriver.com>
    Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>