Re: [PATCH 4/4] tracing: Remove RCU work arounds from stack tracer

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Sep 23, 2017 at 07:22:04AM -0400, Steven Rostedt wrote:
> On Fri, 22 Sep 2017 23:07:37 -0700
> "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> 
> > OK, how about the following?
> > 
> > 	It turns out that functions called from rcu_irq_enter() can
> > 	be subject to various kinds of tracing, which can result in
> > 	rcu_irq_enter() being invoked recursively.  This recursion
> > 	causes RCU to not have been watching when it should have,
> > 	resulting in lockdep-RCU splats.  Switching from rcu_irq_enter()
> > 	to rcu_nmi_enter() still resulted in failures because of calls
> > 	to rcu_irq_enter() between the rcu_nmi_enter() and its matching
> > 	rcu_nmi_exit().  Such calls again cause RCU to not be watching
> > 	when it should have been.
> > 
> > 	In particular, the stack tracer called rcu_irq_enter()
> > 	unconditionally, which is problematic when RCU's last
> > 	not-watching-to-watching transition was carried out by
> > 	rcu_nmi_enter(), as will be the case when tracing uses
> > 	rcu_nmi_enter() to cause RCU to start watching the current CPU.
> > 	In that case, rcu_irq_enter() actually switches RCU back to
> > 	the not-watching state for this CPU, which results in lockdep
> > 	splats complaining about rcu_read_lock() being used on an idle
> > 	(not-watched) CPU.  The first patch of this series addressed
> > 	this problem by having rcu_irq_enter() and rcu_irq_exit()
> > 	refrain from doing anything when rcu_nmi_enter() caused RCU to
> > 	start watching this CPU.  The third patch in this series caused
> > 	save_stack_trace() to invoke rcu_nmi_enter() and rcu_nmi_exit()
> > 	as needed, so this fourth patch now removes the rcu_irq_enter()
> > 	and rcu_irq_exit() from within the stack tracer.
> 
> I think it's a bit too much ;-)  I believe it talks too much about the
> RCU internals, and the bug will be lost on us mere mortals.
> 
> > 
> > > Actually, thinking about this more, this doesn't need to go in stable.
> > > As recursive rcu_irq_enter() calls should not hurt, and you now allow
> > > rcu_irq_enter() to be called even after a rcu_nmi_enter() right?  
> > 
> > Yes, it is now the case that rcu_irq_enter() can be called even after
> > an rcu_nmi_enter() exited idle, because rcu_irq_enter() now checks for
> > this and takes an early exit if so.
> > 
> > But what is it about older kernels prevents the tracing-induced recursive
> > calls to rcu_irq_enter()?
> 
> Ah, the bug is if rcu_irq_enter() is called, and the stack trace
> triggers then. OK, lets keep it simple, and just say this.
> 
> 
>     Currently the stack tracer calls rcu_irq_enter() to make sure RCU
>     is watching when it records a stack trace. But if the stack tracer
>     is triggered while tracing inside of a rcu_irq_enter(), calling
>     rcu_irq_enter() unconditionally can be problematic.
> 
>     The reason for having rcu_irq_enter() in the first place has been
>     fixed from within the saving of the stack trace code, and there's no
>     reason for doing it in the stack tracer itself. Just remove it.
> 
> Simple and to the point.

Works for me!

							Thanx, Paul




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]