Re: RCU stall query

On Thu, Apr 22, 2021 at 04:45:00PM +0100, John Garry wrote:
> On 22/04/2021 15:35, Paul E. McKenney wrote:
> > On Thu, Apr 22, 2021 at 10:20:51AM +0100, John Garry wrote:
> > > Hi RCU experts,
> > > 
> 
> Thanks Paul
> 
> > > Recently I have noticed that I can trigger an RCU stall quite easily on my
> > > system under specific conditions.
> > > 
> > > I have a fair idea why it happens, but need to analyze a proper solution
> > > further. It looks like a hard IRQ handler and threaded part are tied to
> > > specific CPU and getting swamped and not relinquishing.

I should hasten to confirm that saturating a CPU with interrupts can
also result in RCU CPU stall warnings, so please do continue your
efforts fixing this as well.

> > > However, mixed in the RCU splats, I have noticed many BUG logs, like:
> > > 
> > > [  207.788748] BUG: spinlock recursion on CPU#46, fio/1470
> > This is a self-deadlock.  Given that deadlock, and given that spinlocks
> > disable preemption, the RCU CPU stall warnings are expected behavior.
> > After all, your code really is grabbing a CPU by the throat and shaking
> > it indefinitely.
> > 
> > Please build your kernel with CONFIG_PROVE_LOCKING=y and then fix the
> > issues it calls out.  Then please also fix the bugs resulting in the
> > "sleeping function called from invalid context" and in the "scheduling
> > while atomic".
> 
> Here's the rub, the issue goes away with CONFIG_PROVE_LOCKING and all the
> extra debugging it adds. Hmmm.

That can happen.  You have enough going on that fixing what you already
know about might eventually get things to where CONFIG_PROVE_LOCKING
does something useful for you.

							Thanx, Paul
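
To illustrate the "spinlock recursion" self-deadlock discussed above, here
is a minimal userspace sketch, not kernel code.  It assumes a lock that
records its owner, loosely modelled on the owner check that the kernel's
spinlock debugging appears to perform before spinning.  The names
dbg_spinlock, dbg_spin_lock() and dbg_spin_unlock() are hypothetical and
exist only for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define NO_OWNER (-1)
#define DBG_SPINLOCK_INIT { ATOMIC_FLAG_INIT, NO_OWNER }

/* Hypothetical stand-in for a debug spinlock: a lock flag plus the holder's ID. */
struct dbg_spinlock {
	atomic_flag locked;
	atomic_int owner;	/* ID of the current holder, or NO_OWNER */
};

/*
 * Take the lock on behalf of task "self".  Returns false, without
 * spinning, if "self" already holds the lock.  A plain spinlock would
 * instead spin forever here, which is the self-deadlock.
 */
static bool dbg_spin_lock(struct dbg_spinlock *l, int self)
{
	if (atomic_load(&l->owner) == self) {
		fprintf(stderr, "BUG: spinlock recursion, task %d\n", self);
		return false;
	}
	while (atomic_flag_test_and_set_explicit(&l->locked,
						 memory_order_acquire))
		;	/* busy-wait until the holder releases the lock */
	atomic_store(&l->owner, self);
	return true;
}

static void dbg_spin_unlock(struct dbg_spinlock *l)
{
	atomic_store(&l->owner, NO_OWNER);
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

With a lock initialized from DBG_SPINLOCK_INIT, a first dbg_spin_lock(&l, 46)
succeeds, and a second call with the same ID is reported and refused
instead of hanging.  In the kernel, the spinning task has preemption
disabled, so the CPU never reaches a quiescent state and the RCU CPU stall
warning follows.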

> But I get the point that these are separate and need to be fixed also.
> 
> > 
> > In addition, there are quite a few idle tasks called out in your list of
> > stalled CPUs.  This is often due to RCU's grace-period kthread (named
> > "rcu_preempt" in this case) not getting any CPU time.  This is not
> > unexpected given the "RT throttling activated".  If you are going to run
> > code at real-time priorities, you must ensure that any number of kernel
> > kthreads get the CPU time that they need.  As Spiderman's uncle said
> > "With great power comes great responsibility".
> 
> OK, I need to check on that separately also.
> 
> Cheers,
> John
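
As a rough aid for checking the "RT throttling activated" point, the sketch
below computes how much of each scheduling period is left for non-real-time
tasks such as the "rcu_preempt" kthread.  It assumes the usual sysctl files
/proc/sys/kernel/sched_rt_period_us and /proc/sys/kernel/sched_rt_runtime_us
are present.  The helper names read_sysctl_long() and non_rt_reserve_us()
are hypothetical, and whether this budget is the limiting factor on this
particular system is not established by the thread.

```c
#include <stdio.h>

/* Read one integer from a sysctl file.  Returns 0 on success, -1 on failure. */
static int read_sysctl_long(const char *path, long *out)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%ld", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/*
 * Microseconds per period that real-time tasks may not consume.
 * A negative runtime means throttling is disabled, so nothing is
 * reserved and real-time tasks can starve everything else.
 */
static long non_rt_reserve_us(long period_us, long runtime_us)
{
	if (runtime_us < 0 || runtime_us >= period_us)
		return 0;
	return period_us - runtime_us;
}
```

For example, a period of 1000000 us with a runtime of 950000 us leaves
50000 us per period for everything else.  Calling read_sysctl_long() on the
two files above and passing the results to non_rt_reserve_us() gives the
figure for a running system.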


