Re: [BUG almost bisected] Splat in dequeue_rt_stack() and build error

On Mon, Dec 16, 2024 at 11:36:25AM -0800, Paul E. McKenney wrote:
> On Mon, Dec 16, 2024 at 03:38:20PM +0100, Tomas Glozar wrote:
> > On Sun, Dec 15, 2024 at 19:41, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> > >
> > > And the fix for the TREE03 too-short grace periods is finally in, at
> > > least in prototype form:
> > >
> > > https://lore.kernel.org/all/da5065c4-79ba-431f-9d7e-1ca314394443@paulmck-laptop/
> > >
> > > Or this commit on -rcu:
> > >
> > > 22bee20913a1 ("rcu: Fix get_state_synchronize_rcu_full() GP-start detection")
> > >
> > > This passes more than 30 hours of 400 concurrent instances of rcutorture's
> > > TREE03 scenario, with modifications that brought the bug reproduction
> > > rate up to 50 per hour.  I therefore have strong reason to believe that
> > > this fix is a real fix.
> > >
> > > With this fix in place, a 20-hour run of 400 concurrent instances
> > > of rcutorture's TREE03 scenario resulted in 50 instances of the
> > > enqueue_dl_entity() splat pair.  One (untrimmed) instance of this pair
> > > of splats is shown below.
> > >
> > > You guys did reproduce this some time back, so unless you tell me
> > > otherwise, I will assume that you have this in hand.  I would of course
> > > be quite happy to help, especially with adding carefully chosen debug
> > > (heisenbug and all that) or testing of alleged fixes.
> > >
> > 
> > The same splat was recently reported to LKML [1] and a patchset was
> > sent and merged into tip/sched/urgent that fixes a few bugs around
> > double-enqueue of the deadline server [2]. I'm currently re-running
> > TREE03 with those patches, hoping they will also fix this issue.
> 
> Thank you very much!
> 
> An initial four-hour test of 400 instances of an enhanced TREE03 ran
> error-free.  I would have expected about 10 errors, so this gives me
> 99.9+% confidence that the patches improved things at least a little
> bit and 99% confidence that these patches reduced the error rate by at
> least a factor of two.
> 
> I am starting an overnight run.  If that completes without error, this
> will provide 99% confidence that these patches reduced the error rate
> by at least an order of magnitude.

And we have that level of confidence!
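
For anyone wanting to sanity-check the confidence figures quoted above,
they are consistent with a simple Poisson model of error-free runs.  The
sketch below is an illustration under that assumption, not necessarily
how the numbers were derived.  The baseline rate comes from the earlier
20-hour run (50 splat pairs), and the function names are hypothetical.

```python
import math

# Baseline from the earlier run quoted above: 50 splat pairs in 20 hours
# of 400 concurrent TREE03 instances.
BASELINE_ERRORS_PER_HOUR = 50 / 20


def confidence_of_reduction(expected_errors, factor=1.0):
    """Confidence that the error rate fell by at least `factor`,
    given a run that completed with zero errors.

    Assumes errors arrive as a Poisson process.  If the rate had
    fallen by less than `factor`, the expected count would be at
    least expected_errors / factor, so the chance of seeing zero
    errors would be at most exp(-expected_errors / factor).
    """
    return 1.0 - math.exp(-expected_errors / factor)


def error_free_hours_needed(rate_per_hour, factor, confidence):
    """Error-free run length needed to claim a reduction by at least
    `factor` at the given confidence, under the same Poisson model."""
    return -math.log(1.0 - confidence) * factor / rate_per_hour


expected_in_four_hours = BASELINE_ERRORS_PER_HOUR * 4  # about 10 errors

# Any improvement at all (factor of 1).
print(confidence_of_reduction(expected_in_four_hours, 1))
# At least a factor-of-two reduction.
print(confidence_of_reduction(expected_in_four_hours, 2))
# Error-free hours for 99% confidence in an order-of-magnitude reduction.
print(error_free_hours_needed(BASELINE_ERRORS_PER_HOUR, 10, 0.99))
```

Under this model, the four-hour error-free run gives better than 99.9%
confidence of some improvement and a bit over 99% confidence of a
factor-of-two reduction, which matches the figures above.  The last line
estimates how long an error-free run the order-of-magnitude claim would
need if the Poisson assumption holds.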

Tested-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
