The following commit has been merged into the sched/core branch of tip:

Commit-ID:     1c0482707c42960ec46b88aadd6bffca8685db11
Gitweb:        https://git.kernel.org/tip/1c0482707c42960ec46b88aadd6bffca8685db11
Author:        Vineeth Pillai (Google) <vineeth@xxxxxxxxxxxxxxx>
AuthorDate:    Fri, 20 Oct 2023 01:40:28
Committer:     Ingo Molnar <mingo@xxxxxxxxxx>
CommitterDate: Fri, 20 Oct 2023 09:56:21 +02:00

sched/fair: Update ->next_balance correctly during newidle balance

When newidle balancing triggers, we see that it constantly clobbers
rq->next_balance even when there is no newidle balance happening due to
the cost estimates. Due to this, we see that periodic load balance
(rebalance_domains) may trigger way more often when the CPU is going in
and out of idle at a high rate but is not really idle. Repeatedly
triggering load balance there is a bad idea as it is a heavy operation.
It also causes an increase in softirqs.

Another issue is that ->last_balance is not updated after newidle
balance, causing mistakes in the ->next_balance calculations.

Fix by updating last_balance when a newidle load balance actually
happens and then updating next_balance. This is also how it is done in
other load balance paths.

Testing shows a significant drop in softirqs when running:

  $ cyclictest -i 100 -d 100 --latency=1000 -D 5 -t -m -q

... the softirq count goes from ~6,000 to ~800.
Co-developed-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
Signed-off-by: Vineeth Pillai (Google) <vineeth@xxxxxxxxxxxxxxx>
Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Juri Lelli <juri.lelli@xxxxxxxxxx>
Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Cc: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxx>
Cc: Daniel Bristot de Oliveira <bristot@xxxxxxxxxx>
Cc: Valentin Schneider <vschneid@xxxxxxxxxx>
Link: https://lore.kernel.org/r/20231020014031.919742-3-joel@xxxxxxxxxxxxxxxxx
---
 kernel/sched/fair.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8c486ff..393d0dc 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -12122,11 +12122,7 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
 
 	if (!READ_ONCE(this_rq->rd->overload) ||
 	    (sd && this_rq->avg_idle < sd->max_newidle_lb_cost)) {
-
-		if (sd)
-			update_next_balance(sd, &next_balance);
 		rcu_read_unlock();
-
 		goto out;
 	}
 	rcu_read_unlock();
@@ -12141,8 +12137,6 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
 		int continue_balancing = 1;
 		u64 domain_cost;
 
-		update_next_balance(sd, &next_balance);
-
 		if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost)
 			break;
 
@@ -12155,6 +12149,8 @@ static int newidle_balance(struct rq *this_rq, struct rq_flags *rf)
 		t1 = sched_clock_cpu(this_cpu);
 		domain_cost = t1 - t0;
 		update_newidle_cost(sd, domain_cost);
+		sd->last_balance = jiffies;
+		update_next_balance(sd, &next_balance);
 
 		curr_cost += domain_cost;
 		t0 = t1;