Re: [BUG almost bisected] Splat in dequeue_rt_stack() and build error

On Tue, Aug 27, 2024 at 10:33:13AM -0700, Paul E. McKenney wrote:
> On Tue, Aug 27, 2024 at 05:41:52PM +0200, Valentin Schneider wrote:
> > On 27/08/24 12:03, Valentin Schneider wrote:
> > > On 26/08/24 09:31, Paul E. McKenney wrote:
> > >> On Mon, Aug 26, 2024 at 01:44:35PM +0200, Valentin Schneider wrote:
> > >>>
> > >>> Woops...
> > >>
> > >> On the other hand, removing that dequeue_task() makes next-20240823
> > >> pass light testing.
> > >>
> > >> I have to ask...
> > >>
> > >> Does it make sense for Valentin to rearrange those commits to fix
> > >> the two build bugs and remove that dequeue_task(), all in the name of
> > >> bisectability?  Or is there something subtle here so that only Peter
> > >> can do this work, shoulder and all?
> > >>
> > >
> > > I suppose at the very least another pair of eyes on this can't hurt, let me
> > > get untangled from some other things first and I'll take a jab at it.
> > 
> > I've taken tip/sched/core and shuffled hunks around; I didn't re-order any
> > commit. I've also taken out the dequeue from switched_from_fair() and put
> > it at the very top of the branch which should hopefully help bisection.
> > 
> > The final delta between that branch and tip/sched/core is empty, so it
> > really is just shuffling in between commits.
> > 
> > Please find the branch at:
> > 
> > https://gitlab.com/vschneid/linux.git -b mainline/sched/eevdf-complete-builderr
> > 
> > I'll go stare at the BUG itself now.
> 
> Thank you!
> 
> I have fired up tests on the "BROKEN?" commit.  If that fails, I will
> try its predecessor, and if that fails, I will bisect from e28b5f8bda01
> ("sched/fair: Assert {set_next,put_prev}_entity() are properly balanced"),
> which has stood up to heavy hammering in earlier testing.

And 50 runs of TREE03 on the "BROKEN?" commit resulted in 32 failures.
Of these, 29 were the dequeue_rt_stack() failure.  Two more were RCU
CPU stall warnings, and the last one was an oddball "kernel BUG at
kernel/sched/rt.c:1714" followed by an equally oddball "Oops: invalid
opcode: 0000 [#1] PREEMPT SMP PTI".

Just to be specific, this is commit:

df8fe34bfa36 ("BROKEN? sched/fair: Dequeue sched_delayed tasks when switching from fair")

This commit's predecessor is this commit:

2f888533d073 ("sched/eevdf: Propagate min_slice up the cgroup hierarchy")

This predecessor commit passes 50 runs of TREE03 with no failures.

So that addition of that dequeue_task() call to the switched_from_fair()
function is looking quite suspicious to me.  ;-)
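
As a rough illustration of why 50 clean runs on the predecessor is strong
evidence, here is a minimal sketch. It assumes (this is not stated above)
that TREE03 runs fail independently, and it uses the 32-of-50 rate seen on
the "BROKEN?" commit as the per-run failure probability. The variable names
are purely illustrative.

```python
# Counts reported above for the "BROKEN?" commit.
runs = 50
failures = 32
breakdown = {"dequeue_rt_stack": 29, "rcu_cpu_stall": 2, "rt.c_bug": 1}

# Sanity check: the per-category counts add up to the total failures.
assert sum(breakdown.values()) == failures

# Assumption: runs are independent and fail at the observed rate.
p_fail = failures / runs

# Probability of 50 consecutive passes if the predecessor commit
# had the same per-run failure rate as the "BROKEN?" commit.
p_all_pass = (1 - p_fail) ** runs

print(f"per-run failure rate: {p_fail:.2f}")
print(f"chance of {runs} clean runs at that rate: {p_all_pass:.1e}")
```

Under those assumptions, a clean 50-run result would be vanishingly
unlikely if the predecessor shared the bug, which is consistent with the
suspicion falling on the added dequeue_task() call.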

							Thanx, Paul



