Re: [PATCH v2] rcutorture: Convert schedule_timeout_uninterruptible() to mdelay() in rcu_torture_stall()

"Paul E. McKenney" <paulmck@xxxxxxxxxx> · Mon, 20 Mar 2023 16:35:34 -0700

On Mon, Mar 20, 2023 at 11:05:17PM +0000, Zhang, Qiang1 wrote:
> > For kernels built with PREEMPT_NONE and CONFIG_DEBUG_ATOMIC_SLEEP
> > enabled, running the RCU stall tests as follows produces the splat below:
> > 
> > runqemu kvm slirp nographic qemuparams="-m 1024 -smp 4"
> > bootparams="nokaslr console=ttyS0 rcutorture.stall_cpu=30
> > rcutorture.stall_no_softlockup=1 rcutorture.stall_cpu_irqsoff=1
> > rcutorture.stall_cpu_block=1" -d
> > 
> > [   10.841071] rcu-torture: rcu_torture_stall begin CPU stall
> > [   10.841073] rcu_torture_stall start on CPU 3.
> > [   10.841077] BUG: scheduling while atomic: rcu_torture_sta/66/0x0000000
> > ....
> > [   10.841108] Call Trace:
> > [   10.841110]  <TASK>
> > [   10.841112]  dump_stack_lvl+0x64/0xb0
> > [   10.841118]  dump_stack+0x10/0x20
> > [   10.841121]  __schedule_bug+0x8b/0xb0
> > [   10.841126]  __schedule+0x2172/0x2940
> > [   10.841157]  schedule+0x9b/0x150
> > [   10.841160]  schedule_timeout+0x2e8/0x4f0
> > [   10.841192]  schedule_timeout_uninterruptible+0x47/0x50
> > [   10.841195]  rcu_torture_stall+0x2e8/0x300
> > [   10.841199]  kthread+0x175/0x1a0
> > [   10.841206]  ret_from_fork+0x2c/0x50
> > 
> > The above call trace occurs because schedule_timeout() is invoked
> > within a local_irq_disable()/local_irq_enable() critical section.
> > Invoking schedule_timeout() also implies a quiescent state, so it
> > fails to trigger an RCU stall. This commit therefore uses mdelay()
> > instead of schedule_timeout() to trigger the RCU stall.
> > 
> > Suggested-by: Joel Fernandes <joel@xxxxxxxxxxxxxxxxx>
> > Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx>
> > ---
> >  kernel/rcu/rcutorture.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> > index d06c2da04c34..a08a72bef5f1 100644
> > --- a/kernel/rcu/rcutorture.c
> > +++ b/kernel/rcu/rcutorture.c
> > @@ -2472,7 +2472,7 @@ static int rcu_torture_stall(void *args)
> >
> >Right here there is:
> >
> >			if (stall_cpu_block) {
> >
> >In other words, the rcutorture.stall_cpu_block module parameter says to
> >block, even if it is a bad thing to do.  The point of this is to verify
> >the error messages that are supposed to be printed on the console when
> >this happens.
> >
> >  #ifdef CONFIG_PREEMPTION
> >  				preempt_schedule();
> >  #else
> > -				schedule_timeout_uninterruptible(HZ);
> > +				mdelay(jiffies_to_msecs(HZ));
> >
> >So this really needs to stay schedule_timeout_uninterruptible(HZ).
> 
> But invoking schedule_timeout_uninterruptible(HZ) implies a quiescent state,
> so no RCU stall occurs, even though the task is still in an RCU read-side
> critical section (PREEMPT_COUNT=y).
> 
> No RCU stall occurred when I tested with the following parameters:
> rcutorture.stall_cpu=30
> rcutorture.stall_no_softlockup=1
> rcutorture.stall_cpu_irqsoff=1
> rcutorture.stall_cpu_block=1

Understood.  If you want that RCU CPU stall in a CONFIG_PREEMPTION=n
kernel, you should not use rcutorture.stall_cpu_block=1.

In a CONFIG_PREEMPTION=y kernel, rcutorture.stall_cpu_block=1 forces
the grace period to be stalled on a task rather than a CPU, exercising
a different part of the RCU CPU stall warning code.

In a CONFIG_PREEMPTION=n kernel, using rcutorture.stall_cpu_block=1
forces the CPU to go through a quiescent state, as you say.  It can
also cause lockdep and scheduling-while-atomic complaints, depending on
exactly what type of RCU reader is in effect.

So these are test-the-diagnostics parameters.  Using mdelay() instead
would make rcutorture.stall_cpu_block=1 do the same thing as
rcutorture.stall_cpu_block=0 for CONFIG_PREEMPTION=n kernels, right?
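
The cases above can be sketched as a toy user-space model.  This is not
kernel code: the enum, the function name stall_model(), and its
mdelay_patch flag are made up for illustration, and the outcomes simply
restate what is described in this thread.

```c
#include <stdbool.h>

/* Hypothetical labels for this sketch; these are not kernel identifiers. */
enum stall_result {
	STALL_ON_CPU,	/* grace period stalled on a spinning CPU */
	STALL_ON_TASK,	/* grace period stalled on a blocked reader task */
	NO_STALL_QS,	/* CPU went through a quiescent state, so no stall */
};

/*
 * Toy model of the stall loop behavior discussed in this thread.
 *
 * preemption:      true models a CONFIG_PREEMPTION=y kernel
 * stall_cpu_block: models rcutorture.stall_cpu_block
 * mdelay_patch:    true models the proposed change from
 *                  schedule_timeout_uninterruptible(HZ) to mdelay()
 */
static enum stall_result stall_model(bool preemption, bool stall_cpu_block,
				     bool mdelay_patch)
{
	if (!stall_cpu_block)
		return STALL_ON_CPU;	/* plain spin, no blocking requested */
	if (preemption)
		return STALL_ON_TASK;	/* preempt_schedule() within the reader */
	if (mdelay_patch)
		return STALL_ON_CPU;	/* busy-wait, same as stall_cpu_block=0 */
	return NO_STALL_QS;		/* schedule_timeout_uninterruptible(HZ) */
}
```

In this model, the patched CONFIG_PREEMPTION=n case with
stall_cpu_block=1 returns the same result as stall_cpu_block=0, which
is the point of the question above.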

							Thanx, Paul

> Thanks
> Zqiang
> 
> >
> >So should there be a change to kernel-parameters.txt to make it
> >more clear that this is intended behavior?
> >
> >						Thanx, Paul
> >
> >  #endif
> >  			} else if (stall_no_softlockup) {
> >  				touch_softlockup_watchdog();
> > -- 
> > 2.25.1
> >