On Thu, Aug 29, 2024 at 03:50:03PM +0200, Valentin Schneider wrote:
> On 29/08/24 03:28, Paul E. McKenney wrote:
> > On Wed, Aug 28, 2024 at 11:39:19AM -0700, Paul E. McKenney wrote:
> >>
> >> The 500*TREE03 run had exactly one failure that was the dreaded
> >> enqueue_dl_entity() failure, followed by RCU CPU stall warnings.
> >>
> >> But a huge improvement over the prior state!
> >>
> >> Plus, this failure is likely unrelated (see earlier discussions with
> >> Peter).  I just started a 5000*TREE03 run, just in case we can now
> >> reproduce this thing.
> >
> > And we can now reproduce it!  Again, this might be an unrelated bug that
> > was previously a one-off (OK, OK, a two-off!).  Or this series might
> > have made it more probable.  Who knows?
> >
> > Eight of those 5000 runs got us this splat in enqueue_dl_entity():
> >
> >	WARN_ON_ONCE(on_dl_rq(dl_se));
> >
> > Immediately followed by this splat in __enqueue_dl_entity():
> >
> >	WARN_ON_ONCE(!RB_EMPTY_NODE(&dl_se->rb_node));
> >
> > These two splats always happened during rcutorture's testing of
> > RCU priority boosting.  This testing involves spawning a CPU-bound
> > low-priority real-time kthread for each CPU, which is intended to starve
> > the non-realtime RCU readers, which are in turn to be rescued by RCU
> > priority boosting.
>
> Thanks!
>
> > I do not entirely trust the following rcutorture diagnostic, but just
> > in case it helps...
> >
> > Many of them also printed something like this:
> >
> >	[  111.279575] Boost inversion persisted: No QS from CPU 3
> >
> > This message means that rcutorture has decided that RCU priority boosting
> > has failed, but not because a low-priority preempted task was blocking
> > the grace period, but rather because some CPU managed to be running
> > the same task in-kernel the whole time without doing a context switch.
> > In some cases (but not this one), this was simply a side-effect of
> > RCU's grace-period kthread being starved of CPU time.  Such starvation
> > is a surprise in this case because this kthread is running at higher
> > real-time priority than the kthreads that are intended to force RCU
> > priority boosting to happen.
> >
> > Again, I do not entirely trust this rcutorture diagnostic, just in case
> > it helps.
> >
> >						Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > [  287.536845] rcu-torture: rcu_torture_boost is stopping
> > [  287.536867] ------------[ cut here ]------------
> > [  287.540661] WARNING: CPU: 4 PID: 132 at kernel/sched/deadline.c:2003 enqueue_dl_entity+0x50d/0x5c0
> > [  287.542299] Modules linked in:
> > [  287.542868] CPU: 4 UID: 0 PID: 132 Comm: kcompactd0 Not tainted 6.11.0-rc1-00051-gb32d207e39de #1701
> > [  287.544335] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> > [  287.546337] RIP: 0010:enqueue_dl_entity+0x50d/0x5c0
> > [  287.603245]  ? __warn+0x7e/0x120
> > [  287.603752]  ? enqueue_dl_entity+0x54b/0x5c0
> > [  287.604405]  ? report_bug+0x18e/0x1a0
> > [  287.604978]  ? handle_bug+0x3d/0x70
> > [  287.605523]  ? exc_invalid_op+0x18/0x70
> > [  287.606116]  ? asm_exc_invalid_op+0x1a/0x20
> > [  287.606765]  ? enqueue_dl_entity+0x54b/0x5c0
> > [  287.607420]  dl_server_start+0x31/0xe0
> > [  287.608013]  enqueue_task_fair+0x218/0x680
> > [  287.608643]  activate_task+0x21/0x50
> > [  287.609197]  attach_task+0x30/0x50
> > [  287.609736]  sched_balance_rq+0x65d/0xe20
> > [  287.610351]  sched_balance_newidle.constprop.0+0x1a0/0x360
> > [  287.611205]  pick_next_task_fair+0x2a/0x2e0
> > [  287.611849]  __schedule+0x106/0x8b0
>
> Assuming this is still related to switched_from_fair(): since this is hit
> during priority boosting, it would mean rt_mutex_setprio() gets involved,
> but that uses the same set of DQ/EQ flags as __sched_setscheduler().
>
> I don't see any obvious path in
>
>   dequeue_task_fair()
>   `\
>     dequeue_entities()
>
> that would prevent dl_server_stop() from happening when doing the
> class-switch dequeue_task()...  I don't see it in the TREE03 config, but
> can you confirm CONFIG_CFS_BANDWIDTH isn't set in that scenario?
>
> I'm going to keep digging, but I'm not entirely sure yet whether this is
> related to the switched_from_fair() hackery or not.  I'll send the patch
> I have as-is and continue digging for a bit.

Makes sense to me, thank you, and glad that the diagnostics helped.
Looking forward to further fixes.  ;-)

						Thanx, Paul
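
For context, the two warnings quoted above guard the same invariant: a
deadline entity must not already be queued when it is enqueued again.
Below is a minimal standalone sketch of that invariant, with hypothetical
names (struct entity, enqueue_entity, dequeue_entity); it is not the
actual kernel code, just an illustration of the double-enqueue condition
the splats report.

/*
 * Minimal sketch (hypothetical names, not kernel code) of the invariant
 * behind the two splats: an entity must not already be on the runqueue
 * when it is enqueued again.
 */
#include <assert.h>
#include <stdbool.h>

struct entity {
	bool on_rq;	/* stand-in for on_dl_rq() / !RB_EMPTY_NODE() */
};

static void enqueue_entity(struct entity *e)
{
	/* Mirrors WARN_ON_ONCE(on_dl_rq(dl_se)) in enqueue_dl_entity(). */
	assert(!e->on_rq);
	e->on_rq = true;
}

static void dequeue_entity(struct entity *e)
{
	assert(e->on_rq);
	e->on_rq = false;
}

int main(void)
{
	struct entity e = { .on_rq = false };

	enqueue_entity(&e);
	/*
	 * A second enqueue_entity(&e) here, without an intervening
	 * dequeue_entity(&e), would trip the assertion -- analogous to
	 * what the splat shows: dl_server_start() enqueueing a dl_se
	 * that is already queued.
	 */
	dequeue_entity(&e);
	return 0;
}

In the kernel, on_dl_rq() and RB_EMPTY_NODE(&dl_se->rb_node) express the
same "already queued" state in two ways, which is why the two warnings
fire back to back on the same enqueue.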