    rcu: sysctl: Panic on RCU Stall · 088e9d25
    Daniel Bristot de Oliveira authored
    
    
    It is not always easy to determine the cause of an RCU stall just by
    analysing the RCU stall messages, mainly when the problem is caused
    by the indirect starvation of RCU threads, for example when preempt_rcu
    is not awakened due to the starvation of a timer softirq.
    
    We have been hard-coding panic() in the RCU stall functions for
    some time while testing kernel-rt. But this is not possible in
    some scenarios, such as when supporting customers.
    
    This patch implements the sysctl kernel.panic_on_rcu_stall. If
    set to 1, the system will panic() when an RCU stall takes place,
    enabling the capture of a vmcore. The vmcore provides a way to analyze
    all kernel and task state, which helps identify the culprit and the
    solution for the stall.
    
    The kernel.panic_on_rcu_stall sysctl is disabled by default.
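
For reference, the current state of the knob can be checked from user space.
The following is a minimal sketch, assuming the usual procfs mapping of the
sysctl name to /proc/sys/kernel/panic_on_rcu_stall; the helper names
parse_flag and read_panic_on_rcu_stall are hypothetical and not part of this
patch.

```python
from pathlib import Path
from typing import Optional

# Assumed procfs location of kernel.panic_on_rcu_stall (standard sysctl mapping).
SYSCTL_PATH = Path("/proc/sys/kernel/panic_on_rcu_stall")


def parse_flag(text: str) -> bool:
    """Interpret the sysctl file contents: 0 means disabled, non-zero means enabled."""
    return int(text.strip()) != 0


def read_panic_on_rcu_stall(path: Path = SYSCTL_PATH) -> Optional[bool]:
    """Return the sysctl state, or None if the file cannot be read
    (for example, on a kernel that does not carry this patch)."""
    try:
        return parse_flag(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    state = read_panic_on_rcu_stall()
    if state is None:
        print("kernel.panic_on_rcu_stall is not available on this system")
    else:
        print("panic on RCU stall:", "enabled" if state else "disabled")
```

Enabling the behavior requires root, for example with
sysctl -w kernel.panic_on_rcu_stall=1.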
    
    Changes from v1:
    - Fixed a typo in the git log
    - The if (sysctl_panic_on_rcu_stall) panic() check is now in a static function
    - Fixed the CONFIG_TINY_RCU compilation issue
    - The variable sysctl_panic_on_rcu_stall is now __read_mostly
    
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
    Cc: Josh Triplett <josh@joshtriplett.org>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Cc: Lai Jiangshan <jiangshanlai@gmail.com>
    Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
    Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    Reviewed-by: Arnaldo Carvalho de Melo <acme@kernel.org>
    Tested-by: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
    Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>