On Mon, Mar 20, 2023 at 11:05:17PM +0000, Zhang, Qiang1 wrote:
> > For kernels built with enable PREEMPT_NONE and CONFIG_DEBUG_ATOMIC_SLEEP,
> > running the RCU stall tests.
> >
> > runqemu kvm slirp nographic qemuparams="-m 1024 -smp 4"
> > bootparams="nokaslr console=ttyS0 rcutorture.stall_cpu=30
> > rcutorture.stall_no_softlockup=1 rcutorture.stall_cpu_irqsoff=1
> > rcutorture.stall_cpu_block=1" -d
> >
> > [ 10.841071] rcu-torture: rcu_torture_stall begin CPU stall
> > [ 10.841073] rcu_torture_stall start on CPU 3.
> > [ 10.841077] BUG: scheduling while atomic: rcu_torture_sta/66/0x0000000
> > ....
> > [ 10.841108] Call Trace:
> > [ 10.841110] <TASK>
> > [ 10.841112] dump_stack_lvl+0x64/0xb0
> > [ 10.841118] dump_stack+0x10/0x20
> > [ 10.841121] __schedule_bug+0x8b/0xb0
> > [ 10.841126] __schedule+0x2172/0x2940
> > [ 10.841157] schedule+0x9b/0x150
> > [ 10.841160] schedule_timeout+0x2e8/0x4f0
> > [ 10.841192] schedule_timeout_uninterruptible+0x47/0x50
> > [ 10.841195] rcu_torture_stall+0x2e8/0x300
> > [ 10.841199] kthread+0x175/0x1a0
> > [ 10.841206] ret_from_fork+0x2c/0x50
> >
> > The above calltrace occurs in the local_irq_disable/enable() critical
> > section call schedule_timeout(), and invoke schedule_timeout() also
> > implies a quiescent state, of course it also fails to trigger RCU stall,
> > this commit therefore use mdelay() instead of schedule_timeout() to
> > trigger RCU stall.
> >
> > Suggested-by: Joel Fernandes <joel@xxxxxxxxxxxxxxxxx>
> > Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx>
> > ---
> > kernel/rcu/rcutorture.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> > index d06c2da04c34..a08a72bef5f1 100644
> > --- a/kernel/rcu/rcutorture.c
> > +++ b/kernel/rcu/rcutorture.c
> > @@ -2472,7 +2472,7 @@ static int rcu_torture_stall(void *args)
> >
> >Right here there is:
> >
> > 		if (stall_cpu_block) {
> >
> >In other words, the rcutorture.stall_cpu_block module parameter says to
> >block, even if it is a bad thing to do. The point of this is to verify
> >the error messages that are supposed to be printed on the console when
> >this happens.
> >
> > #ifdef CONFIG_PREEMPTION
> > 			preempt_schedule();
> > #else
> > -			schedule_timeout_uninterruptible(HZ);
> > +			mdelay(jiffies_to_msecs(HZ));
> >
> >So this really needs to stay schedule_timeout_uninterruptible(HZ).
>
> But invoke schedule_timeout_uninterruptible(HZ) implies a quiescent state,
> this will not cause an RCU stall to occur, and still in the RCU read critical section(PREEMPT_COUNT=y).
>
> It didn't happen RCU stall when I tested with the following parameters for
> rcutorture.stall_cpu=30
> rcutorture.stall_no_softlockup=1
> rcutorture.stall_cpu_irqsoff=1
> rcutorture.stall_cpu_block=1

Understood.

If you want that RCU CPU stall in a CONFIG_PREEMPTION=n kernel, you should
not use rcutorture.stall_cpu_block=1.

In a CONFIG_PREEMPTION=y kernel, rcutorture.stall_cpu_block=1 forces the
grace period to be stalled on a task rather than a CPU, exercising a
different part of the RCU CPU stall warning code.

In a CONFIG_PREEMPTION=n kernel, using rcutorture.stall_cpu_block=1 forces
the CPU to go through a quiescent state, as you say. It can also cause
lockdep and scheduling-while-atomic complaints, depending on exactly what
type of RCU reader is in effect.

So these are test-the-diagnostics parameters.
The mdelay() instead makes rcutorture.stall_cpu_block=1 do the same
thing as does rcutorture.stall_cpu_block=0 for CONFIG_PREEMPTION=n
kernels, right?

							Thanx, Paul

> Thanks
> Zqiang
>
> >
> >So should there be a change to kernel-parameters.txt to make it
> >more clear that this is intended behavior?
> >
> >							Thanx, Paul
> >
> > #endif
> > 		} else if (stall_no_softlockup) {
> > 			touch_softlockup_watchdog();
> > --
> > 2.25.1
> >
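
For reference, the behavior discussed in this thread can be summarized as a small userspace model. This is only a sketch of the outcomes as described above, not kernel code: the enum, the function name model_stall_outcome(), and the mdelay_patch flag are hypothetical names introduced for illustration, and the stall_cpu_block=0 case is assumed to stall on the CPU as the reply implies.

```c
#include <stdbool.h>

/* Hypothetical labels for the outcomes described in the thread. */
enum stall_outcome {
	STALL_ON_CPU,		/* CPU spins in the reader: grace period stalled on a CPU */
	STALL_ON_TASK,		/* reader is preempted: grace period stalled on a task */
	QUIESCENT_NO_STALL,	/* CPU passes through a quiescent state: no stall,
				 * but lockdep or scheduling-while-atomic
				 * complaints may appear */
};

/*
 * Model only. "mdelay_patch" stands for the proposed change that replaces
 * schedule_timeout_uninterruptible(HZ) with mdelay(jiffies_to_msecs(HZ))
 * in the CONFIG_PREEMPTION=n branch.
 */
static enum stall_outcome model_stall_outcome(bool config_preemption,
					      bool stall_cpu_block,
					      bool mdelay_patch)
{
	if (!stall_cpu_block)
		return STALL_ON_CPU;	/* assumed: plain busy-wait stall */
	if (config_preemption)
		return STALL_ON_TASK;	/* preempt_schedule() inside the reader */
	/* CONFIG_PREEMPTION=n with stall_cpu_block=1 */
	return mdelay_patch ? STALL_ON_CPU : QUIESCENT_NO_STALL;
}
```

In this model, model_stall_outcome(false, true, true) equals model_stall_outcome(false, false, true), which is the point raised in the reply: with the mdelay() change, stall_cpu_block=1 would no longer exercise a distinct path on CONFIG_PREEMPTION=n kernels.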