On Sun, Oct 06, 2024 at 01:44:53PM -0700, Paul E. McKenney wrote:

> With your patch, I got 24 failures out of 100 TREE03 runs of 18 hours
> each. The failures were different, though, mostly involving boost
> failures in which RCU priority boosting didn't actually result in the
> low-priority readers getting boosted.

Somehow I feel this is progress, albeit very minor :/

> There were also a number of "sched: DL replenish lagged too much"
> messages, but it looks like this was a symptom of the ftrace dump.
>
> Given that this now involves priority boosting, I am trying 400*TREE03
> with each guest OS restricted to four CPUs to see if that makes things
> happen more quickly, and will let you know how this goes.
>
> Any other debug I should apply?

The sched_pi_setprio tracepoint perhaps?

I've read all the RCU_BOOST and rtmutex code (once again), and I've been
running pi_stress with --sched id=low,policy=other to ensure the code
paths in question are taken. But so far so very nothing :/

(Noting that both RCU_BOOST and PI futexes use the same rt_mutex / PI API)

You know RCU_BOOST better than me.. then again, it is utterly weird this
is apparently affected. I've gotta ask, a kernel with my patch on and
additionally flipping kernel/sched/features.h:SCHED_FEAT(DELAY_DEQUEUE,
false) functions as expected?

One very minor thing I noticed while I read the code, do with as you
think best...

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 1c7cbd145d5e..95061119653d 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1071,10 +1071,6 @@ static int rcu_boost(struct rcu_node *rnp)
 	 * Recheck under the lock: all tasks in need of boosting
 	 * might exit their RCU read-side critical sections on their own.
 	 */
-	if (rnp->exp_tasks == NULL && rnp->boost_tasks == NULL) {
-		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
-		return 0;
-	}
 
 	/*
 	 * Preferentially boost tasks blocking expedited grace periods.
@@ -1082,10 +1078,13 @@ static int rcu_boost(struct rcu_node *rnp)
 	 * expedited grace period must boost all blocked tasks, including
 	 * those blocking the pre-existing normal grace period.
 	 */
-	if (rnp->exp_tasks != NULL)
-		tb = rnp->exp_tasks;
-	else
+	tb = rnp->exp_tasks;
+	if (!tb)
 		tb = rnp->boost_tasks;
+	if (!tb) {
+		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
+		return 0;
+	}
 
 	/*
 	 * We boost task t by manufacturing an rt_mutex that appears to