Re: RCU stalls with TREE07 on v6.0 kernel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Dec 31, 2022 at 4:07 PM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
>
> On Sat, Dec 31, 2022 at 12:28:10AM -0500, Joel Fernandes wrote:
> > On Fri, Dec 30, 2022 at 9:14 PM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> > >
> > > On Fri, Dec 30, 2022 at 08:42:24PM -0500, Joel Fernandes wrote:
> > > > Hello,
> > > >
> > > > I have been firefighting a hang on 6.0.y stable kernels with
> > > > rcutorture. It happens mostly consistently when TREE07 is shutting
> > > > down.
> > > >
> > > > It appears that the RCU torture threads are attempted to stop but the
> > > > shutdown thread, but are constantly awakened by a timer softirq
> > > > handler in ksoftirqd context. When they wake up, they immediately goto
> > > > sleep in uninterruptible state until the next time a timer handler
> > > > wakes them up. It appears the timer softirq is long enough to cause
> > > > RCU stalls and I see it calling 100s of timer function handlers
> > > > (call_timer_fn).
> > > >
> > > > I am doing some more investigation with trace_printk(s):
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/commit/?h=stable/trace-hang-6.0.y&id=b779b1e92c97f29333a282ee8f548da02f64de2b
> > > >
> > > > Regarding the timer handlers, I was wondering if it is possible that a
> > > > large number of timer handlers constantly queued can cause RCU stalls
> > > > due to the timer softirq taking a very long time. That certainly
> > > > appears to be the case here. Shouldn't the timer softirq also do
> > > > rcu_softirq_qs() similar to the ksoftirq loop, in case there are too
> > > > many of them?
> > >
> > > That is certainly a good thing to try!
> >
> > I am trying something like this just for testing, let's see what happens ;-)
> >
> > @@ -1788,9 +1796,14 @@ static inline void __run_timers(struct timer_base *base)
> >
> >                 while (levels--)
> >                         expire_timers(base, heads + levels);
>
> Might you need to put it inside the body of this "while" loop?

I could but even doing the below did not help. Do you see any reason
putting inside the loop helps? Let me know.

> > +               rcu_softirq_qs();
> >         }
> >
> > I guess I am also wondering why the rcu reader does not stop queuing
> > timers. It is doing schedule_timeout_interruptible() constantly even
> > though the test is being stopped.
>
> As in it is doing schedule_timeout_interruptible() in a loop that does
> not check torture_must_stop()?  That would be bad unless there is some
> reason the loop cannot just stop.  But if the loop cannot just stop,
> that leads me to believe that there is a bug elsewhere.  After all, given
> that the test is ending, why shouldn't all the loops be able to just stop?

Sorry I meant uninterruptible sleep. So
schedule_timeout_uninterruptible(). But I still cannot nail where in
the torture test it is coming from. Now I put a trace_dump_stack()
directly in the schedule loop() and I am hoping it will show:

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c2bb0cd4225d5..d79bc3b91818a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6390,6 +6390,8 @@ pick_next_task(struct rq *rq, struct task_struct
*prev, struct rq_flags *rf)
*
* WARNING: must be called with preemption disabled!
*/
+static int rcu_tort_ds;
+
static void __sched notrace __schedule(unsigned int sched_mode)
{
struct task_struct *prev, *next;
@@ -6452,6 +6454,11 @@ static void __sched notrace __schedule(unsigned
int sched_mode)
if (prev->sched_contributes_to_load)
rq->nr_uninterruptible++;
+ // Dump stack for uninterruptible every 10 stacks.
+   if (prev_state & TASK_UNINTERRUPTIBLE && ((rcu_tort_ds++ % 10) == 0)) {
+       trace_dump_stack(0);
+ }
+
/*
* __schedule() ttwu()
* prev_state = prev->state; if (p->on_rq && ...

I agree with you, it could be that the loops in the test stop but
something else in the kernel is looping (via an API that the test
called), which is doing uninterruptible sleep loops. Weird indeed.

> (Maybe it is waiting on a buggy child task, for example.)

Hmm, it is the kernel threads (rcu_torture_read,
rcu_torture_fakewriter etc) though, not userspace, that are
waking/sleeping.

keep you posted, thanks,

-  Joel



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux