The following commit has been merged into the sched/urgent branch of tip:

Commit-ID:     16b0a7a1a0af9db6e008fecd195fe4d6cb366d83
Gitweb:        https://git.kernel.org/tip/16b0a7a1a0af9db6e008fecd195fe4d6cb366d83
Author:        Vincent Guittot <vincent.guittot@xxxxxxxxxx>
AuthorDate:    Mon, 02 Nov 2020 11:24:57 +01:00
Committer:     Peter Zijlstra <peterz@xxxxxxxxxxxxx>
CommitterDate: Tue, 10 Nov 2020 18:38:48 +01:00

sched/fair: Ensure tasks spreading in LLC during LB

schbench shows latency increase for 95 percentile above since:

  commit 0b0695f2b34a ("sched/fair: Rework load_balance()")

Align the behavior of the load balancer with the wake up path, which tries
to select an idle CPU which belongs to the LLC for a waking task.

calculate_imbalance() will use nr_running instead of the spare
capacity when CPUs share resources (ie cache) at the domain level. This
will ensure a better spread of tasks on idle CPUs.

Running schbench on a hikey (8cores arm64) shows the problem:

tip/sched/core :
schbench -m 2 -t 4 -s 10000 -c 1000000 -r 10
Latency percentiles (usec)
	50.0th: 33
	75.0th: 45
	90.0th: 51
	95.0th: 4152
	*99.0th: 14288
	99.5th: 14288
	99.9th: 14288
	min=0, max=14276

tip/sched/core + patch :
schbench -m 2 -t 4 -s 10000 -c 1000000 -r 10
Latency percentiles (usec)
	50.0th: 34
	75.0th: 47
	90.0th: 52
	95.0th: 78
	*99.0th: 94
	99.5th: 94
	99.9th: 94
	min=0, max=94

Fixes: 0b0695f2b34a ("sched/fair: Rework load_balance()")
Reported-by: Chris Mason <clm@xxxxxx>
Suggested-by: Rik van Riel <riel@xxxxxxxxxxx>
Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Reviewed-by: Rik van Riel <riel@xxxxxxxxxxx>
Tested-by: Rik van Riel <riel@xxxxxxxxxxx>
Link: https://lkml.kernel.org/r/20201102102457.28808-1-vincent.guittot@xxxxxxxxxx
---
 kernel/sched/fair.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index aa4c622..210b15f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9031,7 +9031,8 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
 	 * emptying busiest.
 	 */
 	if (local->group_type == group_has_spare) {
-		if (busiest->group_type > group_fully_busy) {
+		if ((busiest->group_type > group_fully_busy) &&
+		    !(env->sd->flags & SD_SHARE_PKG_RESOURCES)) {
 			/*
 			 * If busiest is overloaded, try to fill spare
 			 * capacity. This might end up creating spare capacity