On Wed, May 18, 2022 at 07:43:10PM +0800, Zqiang wrote:
> This commit adds a "D" indicator to expedited RCU CPU stall warnings.
> when an expedited grace period begins, due to CPU disable interrupt
> time too long, cause the IPI(rcu_exp_handler()) unable to respond in
> time, this debugging id will be showed.
>
> runqemu kvm slirp nographic qemuparams="-m 4096 -smp 4" bootparams=
> "isolcpus=2,3 nohz_full=2,3 rcu_nocbs=2,3 rcutree.dump_tree=1
> rcutorture.stall_cpu_holdoff=30 rcutorture.stall_cpu=40
> rcutorture.stall_cpu_irqsoff=1 rcutorture.stall_cpu_block=0
> rcutorture.stall_no_softlockup=1" -d
>
> rcu_torture_stall start on CPU 1.
> ............
> rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks:
> { 1-...D } 26467 jiffies s: 13317 root: 0x1/.
> rcu: blocking rcu_node structures (internal RCU debug): l=1:0-1:0x2/.
> Task dump for CPU 1:
> task:rcu_torture_sta state:R running task stack: 0 pid: 76
> ppid: 2 flags:0x00004008
>
> Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx>

Nice!!!

I have queued this for v5.20 and for further testing and review, thank you!

As usual, I could not resist the temptation to wordsmith the commit log,
so could you please check it in case I messed something up?

							Thanx, Paul

------------------------------------------------------------------------

commit 178b9d47f3049e8122738c3166ee4975b75cba55
Author: Zqiang <qiang1.zhang@xxxxxxxxx>
Date:   Wed May 18 19:43:10 2022 +0800

    rcu: Add irqs-disabled indicator to expedited RCU CPU stall warnings

    If a CPU has interrupts disabled continuously starting before the
    beginning of a given expedited RCU grace period, that CPU will not
    execute that grace period's IPI handler.  This will in turn mean
    that the ->cpu_no_qs.b.exp field in that CPU's rcu_data structure
    will continue to contain the boolean value false.

    Knowing whether or not a CPU has had interrupts disabled can be helpful
    when debugging an expedited RCU CPU stall warning, so this commit
    adds a "D" indicator to expedited RCU CPU stall warnings that signifies
    that the corresponding CPU has had interrupts disabled throughout.

    This capability was tested as follows:

    runqemu kvm slirp nographic qemuparams="-m 4096 -smp 4" bootparams=
    "isolcpus=2,3 nohz_full=2,3 rcu_nocbs=2,3 rcutree.dump_tree=1
    rcutorture.stall_cpu_holdoff=30 rcutorture.stall_cpu=40
    rcutorture.stall_cpu_irqsoff=1 rcutorture.stall_cpu_block=0
    rcutorture.stall_no_softlockup=1" -d

    The rcu_torture_stall() function ran on CPU 1, which displays the "D"
    as expected given the rcutorture.stall_cpu_irqsoff=1 module parameter:

    ............
    rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks:
    { 1-...D } 26467 jiffies s: 13317 root: 0x1/.
    rcu: blocking rcu_node structures (internal RCU debug): l=1:0-1:0x2/.
    Task dump for CPU 1:
    task:rcu_torture_sta state:R running task stack: 0 pid: 76
    ppid: 2 flags:0x00004008

    Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx>
    Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 4c7037b507032..f092c7f18a5f3 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -637,10 +637,11 @@ static void synchronize_rcu_expedited_wait(void)
 					continue;
 				ndetected++;
 				rdp = per_cpu_ptr(&rcu_data, cpu);
-				pr_cont(" %d-%c%c%c", cpu,
+				pr_cont(" %d-%c%c%c%c", cpu,
 					"O."[!!cpu_online(cpu)],
 					"o."[!!(rdp->grpmask & rnp->expmaskinit)],
-					"N."[!!(rdp->grpmask & rnp->expmaskinitnext)]);
+					"N."[!!(rdp->grpmask & rnp->expmaskinitnext)],
+					"D."[!!(rdp->cpu_no_qs.b.exp)]);
 			}
 		}
 		pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n",
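
For anyone reading the "D."[!!(...)] expressions for the first time, the
following is a minimal userspace sketch of the indexing idiom the patch
relies on, not kernel code. The struct exp_stall_state type and the
format_exp_stall_flags() helper are hypothetical names invented for this
illustration; the four booleans stand in for the kernel expressions named
in the comments. Each two-character string literal is indexed by a value
normalized to 0 or 1, so a false condition selects the letter and a true
condition selects the dot.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the per-CPU state that the stall warning
 * consults.  These are not kernel types; each field mirrors one of the
 * expressions passed to pr_cont() in the patch.
 */
struct exp_stall_state {
	bool online;             /* cpu_online(cpu) */
	bool in_expmaskinit;     /* rdp->grpmask & rnp->expmaskinit */
	bool in_expmaskinitnext; /* rdp->grpmask & rnp->expmaskinitnext */
	bool cpu_no_qs_exp;      /* rdp->cpu_no_qs.b.exp */
};

/*
 * Format one "<cpu>-<flags>" entry.  "X."[!!cond] yields 'X' when cond
 * is false (index 0) and '.' when cond is true (index 1).
 */
static int format_exp_stall_flags(char *buf, size_t len, int cpu,
				  const struct exp_stall_state *s)
{
	return snprintf(buf, len, "%d-%c%c%c%c", cpu,
			"O."[!!s->online],
			"o."[!!s->in_expmaskinit],
			"N."[!!s->in_expmaskinitnext],
			"D."[!!s->cpu_no_qs_exp]);
}
```

With an online CPU 1 that is set in both masks and whose cpu_no_qs_exp
stand-in is still false, the helper produces "1-...D", matching the entry
in the stall warning quoted in the commit log.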