Hi,

On 2024-08-28 at 14:35:45 +0200, Valentin Schneider wrote:
> On 27/08/24 13:36, Paul E. McKenney wrote:
> > On Tue, Aug 27, 2024 at 10:30:24PM +0200, Valentin Schneider wrote:
> >> On 27/08/24 11:35, Paul E. McKenney wrote:
> >> > On Tue, Aug 27, 2024 at 10:33:13AM -0700, Paul E. McKenney wrote:
> >> >> On Tue, Aug 27, 2024 at 05:41:52PM +0200, Valentin Schneider wrote:
> >> >> > I've taken tip/sched/core and shuffled hunks around; I didn't re-order any
> >> >> > commit. I've also taken out the dequeue from switched_from_fair() and put
> >> >> > it at the very top of the branch which should hopefully help bisection.
> >> >> >
> >> >> > The final delta between that branch and tip/sched/core is empty, so it
> >> >> > really is just shuffling in between commits.
> >> >> >
> >> >> > Please find the branch at:
> >> >> >
> >> >> > https://gitlab.com/vschneid/linux.git -b mainline/sched/eevdf-complete-builderr
> >> >> >
> >> >> > I'll go stare at the BUG itself now.
> >> >>
> >> >> Thank you!
> >> >>
> >> >> I have fired up tests on the "BROKEN?" commit. If that fails, I will
> >> >> try its predecessor, and if that fails, I will bisect from e28b5f8bda01
> >> >> ("sched/fair: Assert {set_next,put_prev}_entity() are properly balanced"),
> >> >> which has stood up to heavy hammering in earlier testing.
> >> >
> >> > And 50 runs of TREE03 on the "BROKEN?" commit resulted in 32 failures.
> >> > Of these, 29 were the dequeue_rt_stack() failure. Two more were RCU
> >> > CPU stall warnings, and the last one was an oddball "kernel BUG at
> >> > kernel/sched/rt.c:1714" followed by an equally oddball "Oops: invalid
> >> > opcode: 0000 [#1] PREEMPT SMP PTI".
> >> >
> >> > Just to be specific, this is commit:
> >> >
> >> > df8fe34bfa36 ("BROKEN? sched/fair: Dequeue sched_delayed tasks when switching from fair")
> >> >
> >> > This commit's predecessor is this commit:
> >> >
> >> > 2f888533d073 ("sched/eevdf: Propagate min_slice up the cgroup hierarchy")
> >> >
> >> > This predecessor commit passes 50 runs of TREE03 with no failures.
> >> >
> >> > So that addition of that dequeue_task() call to the switched_from_fair()
> >> > function is looking quite suspicious to me. ;-)
> >> >
> >> > Thanx, Paul
> >>
> >> Thanks for the testing!
> >>
> >> The WARN_ON_ONCE(!rt_se->on_list); hit in __dequeue_rt_entity() feels like
> >> a put_prev/set_next kind of issue...
> >>
> >> So far I'd assumed a ->sched_delayed task can't be current during
> >> switched_from_fair(), I got confused because it's Mond^CCC Tuesday, but I
> >> think that still holds: we can't get a balance_dl() or balance_rt() to drop
> >> the RQ lock because prev would be fair, and we can't get a
> >> newidle_balance() with a ->sched_delayed task because we'd have
> >> sched_fair_runnable() := true.
> >>
> >> I'll pick this back up tomorrow, this is a task that requires either
> >> caffeine or booze and it's too late for either.
> >
> > Thank you for chasing this, and get some sleep! This one is of course
> > annoying, but it is not (yet) an emergency. I look forward to seeing
> > what you come up with.
> >
> > Also, I would of course be happy to apply debug patches.
> >
> > Thanx, Paul
>
> Chen Yu made me realize [1] that dequeue_task() really isn't enough; the
> dequeue_task() in e.g. __sched_setscheduler() won't have DEQUEUE_DELAYED,
> so stuff will just be left on the CFS tree.
>

One question: although there is no DEQUEUE_DELAYED flag, isn't it still
possible for the delayed task to be dequeued from the CFS tree? The dequeue
in __sched_setscheduler() does not have DEQUEUE_SLEEP, and in
dequeue_entity():

	bool sleep = flags & DEQUEUE_SLEEP;

	if (flags & DEQUEUE_DELAYED) {

	} else {
		bool delay = sleep;
		if (sched_feat(DELAY_DEQUEUE) && delay && //false
		    !entity_eligible(cfs_rq, se)) {
			//do not dequeue
		}
	}

	//dequeue the task <---- we should reach here?

thanks,
Chenyu

> Worse, what we need here is the __block_task() like we have at the end of
> dequeue_entities(), otherwise p stays ->on_rq and that's borked - AFAICT
> that explains the splat you're getting, because affine_move_task() ends up
> doing a move_queued_task() for what really is a dequeued task.
>
> I unfortunately couldn't reproduce the issue locally using your TREE03
> invocation. I've pushed a new patch on top of my branch, would you mind
> giving it a spin? It's a bit sketchy but should at least be going in the
> right direction...
>
> [1]: http://lore.kernel.org/r/Zs2d2aaC/zSyR94v@chenyu5-mobl2
>
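
For reference, the flag check asked about in the dequeue_entity() excerpt
in this mail can be modelled in plain userspace C. This is only a sketch of
the excerpt as quoted here, not the kernel code: the MODEL_* bit values are
placeholders, the function names are hypothetical, and any handling the
real dequeue_entity() does beyond the quoted lines is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values for this model only; not taken from kernel headers. */
#define MODEL_DEQUEUE_SLEEP   0x1u
#define MODEL_DEQUEUE_DELAYED 0x2u

/*
 * Returns true when the modelled check would take the "do not dequeue"
 * branch, i.e. the entity would be left on the tree as a delayed dequeue.
 * Returns false when control falls through to the actual dequeue.
 *
 * feat_delay_dequeue stands in for sched_feat(DELAY_DEQUEUE) and
 * eligible stands in for entity_eligible(cfs_rq, se).
 */
bool model_dequeue_is_deferred(unsigned int flags,
			       bool feat_delay_dequeue,
			       bool eligible)
{
	bool sleep = (flags & MODEL_DEQUEUE_SLEEP) != 0;

	/* With the DELAYED flag set, the deferral branch is skipped entirely. */
	if (flags & MODEL_DEQUEUE_DELAYED)
		return false;

	/* Deferral needs the feature on, a sleep dequeue, and an ineligible entity. */
	return feat_delay_dequeue && sleep && !eligible;
}

/*
 * The case raised above: a dequeue with neither SLEEP nor DELAYED set,
 * on an ineligible entity with the feature enabled.
 */
void model_check_no_sleep_case(void)
{
	assert(!model_dequeue_is_deferred(0, true, false));
}
```

In this model a dequeue without the SLEEP bit is never deferred, whatever
the eligibility of the entity, which is the path the question above is
about. Whether the rest of the task state ends up consistent afterwards is
a separate matter that the model does not cover.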