    powerpc/pasemi: Fix crash on reboot · 72640d88
    Steven Rostedt authored
    commit f96972f2
    
     "kernel/sys.c: call disable_nonboot_cpus() in
    kernel_restart()"
    
    added a call to disable_nonboot_cpus() in kernel_restart(), which tries
    to shut down all the CPUs except the first one. The issue with the PA
    Semi is that it does not support CPU hotplug.
    
    When the call is made to __cpu_down(), it first calls the
    CPU_DOWN_PREPARE notifiers, and then tries to take the CPU down.
    
    One of the notifiers registered with the CPU hotplug code is cpufreq.
    The DOWN_PREPARE notifier calls __cpufreq_remove_dev(), which calls
    cpufreq_driver->exit. The PA Semi exit handler unmaps regions of I/O
    that are used by an interrupt that goes off constantly
    (system_reset_common, but it goes off during normal system operations
    too). I'm not sure exactly what this interrupt does.
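
    The ordering described above can be sketched as a small user-space
    model. This is only an illustration, not kernel code: every model_*
    name and the region_mapped flag are hypothetical stand-ins, and the
    handler returns -EFAULT where the real kernel would oops. The -38
    reported in the crash log corresponds to -ENOSYS on Linux.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define NR_CPUS 2

/* Hypothetical stand-in for the I/O region mapped by the cpufreq driver. */
static bool region_mapped = true;

/* Models cpufreq_driver->exit: the region is unmapped here. */
static void model_cpufreq_exit(void)
{
	region_mapped = false;
}

/* Models the CPU_DOWN_PREPARE notifier path that reaches cpufreq. */
static void model_down_prepare(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	model_cpufreq_exit();
}

/* Models __cpu_down() on a platform without CPU hotplug support. */
static int model_cpu_down(unsigned int cpu)
{
	model_down_prepare(cpu);	/* the notifiers run first ... */
	return -ENOSYS;			/* ... then the down itself fails */
}

/*
 * Models the recurring interrupt handler that touches the region.
 * Returns 0 while the region is mapped, -EFAULT once it is gone.
 */
static int model_reset_exception(void)
{
	return region_mapped ? 0 : -EFAULT;
}
```

    In this model the CPU is still running after model_cpu_down() fails,
    but the region it depends on is already gone, so the next call to
    model_reset_exception() fails.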
    
    Running a simple function trace, you can see it goes off quite a bit:
    
    # tracer: function
    #
    #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
    #              | |       |          |         |
              <idle>-0     [001]  1558.859363: .pasemi_system_reset_exception <-.system_reset_exception
              <idle>-0     [000]  1558.860112: .pasemi_system_reset_exception <-.system_reset_exception
              <idle>-0     [000]  1558.861109: .pasemi_system_reset_exception <-.system_reset_exception
              <idle>-0     [001]  1558.861361: .pasemi_system_reset_exception <-.system_reset_exception
              <idle>-0     [000]  1558.861437: .pasemi_system_reset_exception <-.system_reset_exception
    
    When the region is unmapped, the system crashes with:
    
    Disabling non-boot CPUs ...
    Error taking CPU1 down: -38
    Unable to handle kernel paging request for data at address 0xd0000800903a0100
    Faulting instruction address: 0xc000000000055fcc
    Oops: Kernel access of bad area, sig: 11 [#1]
    PREEMPT SMP NR_CPUS=64 NUMA PA Semi PWRficient
    Modules linked in: shpchp
    NIP: c000000000055fcc LR: c000000000055fb4 CTR: c0000000000df1fc
    REGS: c0000000012175d0 TRAP: 0300   Not tainted  (3.8.0-rc4-test-dirty)
    MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI>  CR: 24000088  XER: 00000000
    SOFTE: 0
    DAR: d0000800903a0100, DSISR: 42000000
    TASK = c0000000010e9008[0] 'swapper/0' THREAD: c000000001214000 CPU: 0
    GPR00: d0000800903a0000 c000000001217850 c0000000012167e0 0000000000000000
    GPR04: 0000000000000000 0000000000000724 0000000000000724 0000000000000000
    GPR08: 0000000000000000 0000000000000000 0000000000000001 0000000000a70000
    GPR12: 0000000024000080 c00000000fff0000 ffffffffffffffff 000000003ffffae0
    GPR16: ffffffffffffffff 0000000000a21198 0000000000000060 0000000000000000
    GPR20: 00000000008fdd35 0000000000a21258 000000003ffffaf0 0000000000000417
    GPR24: 0000000000a226d0 c000000000000000 0000000000000000 0000000000000000
    GPR28: c00000000138b358 0000000000000000 c000000001144818 d0000800903a0100
    NIP [c000000000055fcc] .set_astate+0x5c/0xa4
    LR [c000000000055fb4] .set_astate+0x44/0xa4
    Call Trace:
    [c000000001217850] [c000000000055fb4] .set_astate+0x44/0xa4 (unreliable)
    [c0000000012178f0] [c00000000005647c] .restore_astate+0x2c/0x34
    [c000000001217980] [c000000000054668] .pasemi_system_reset_exception+0x6c/0x88
    [c000000001217a00] [c000000000019ef0] .system_reset_exception+0x48/0x84
    [c000000001217a80] [c000000000001e40] system_reset_common+0x140/0x180
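
    As a rough aid to reading the oops, the faulting data address (DAR)
    lies 0x100 bytes above the value in GPR00. Reading GPR00 as the base
    of the unmapped I/O region is an assumption, not something the log
    states. The helper name below is hypothetical; the constants are
    copied from the log.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the oops above. */
#define OOPS_DAR	0xd0000800903a0100ULL
#define OOPS_GPR00	0xd0000800903a0000ULL

/* Compile-time check of the arithmetic; no kernel facilities involved. */
static_assert(OOPS_DAR - OOPS_GPR00 == 0x100,
	      "DAR is expected to sit 0x100 above GPR00");

/* Offset of a faulting address from an assumed base address. */
static uint64_t fault_offset(uint64_t dar, uint64_t base)
{
	return dar - base;
}
```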
    
    Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>