A system that tends to overschedule (such as PREEMPT_RT) will naturally
tend to newidle balance often as well.  This may have quite a negative
impact on performance.  This patch attempts to address the overzealous
newidle balancing by only allowing it to occur if the previous task was
SCHED_OTHER.

Some may argue that if the system is going idle, it should try to
newidle balance to keep it doing useful work.  But the fact is that
spending too much time in the load-balancing code demonstrably hurts
performance as well.  Running oprofile on the system with various
workloads has shown that we can sometimes spend a majority of our
cpu-time running load_balance_newidle.  Additionally, disabling newidle
balancing can improve performance on said workloads by up to 200%.
Obviously disabling the feature outright is not sustainable, but
hopefully we can make it smarter.  This code assumes that if there
aren't any CFS tasks present on the queue, it was probably already
balanced.

Signed-off-by: Gregory Haskins <ghaskins@xxxxxxxxxx>
---

 kernel/sched.c      |    4 +---
 kernel/sched_fair.c |    9 +++++++++
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 490e6bc..3efbbc5 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -1310,6 +1310,7 @@ static unsigned long source_load(int cpu, int type);
 static unsigned long target_load(int cpu, int type);
 static unsigned long cpu_avg_load_per_task(int cpu);
 static int task_hot(struct task_struct *p, u64 now, struct sched_domain *sd);
+static void idle_balance(int this_cpu, struct rq *this_rq);
 #endif /* CONFIG_SMP */
 
 #include "sched_stats.h"
@@ -4170,9 +4171,6 @@ asmlinkage void __sched __schedule(void)
 		prev->sched_class->pre_schedule(rq, prev);
 #endif
 
-	if (unlikely(!rq->nr_running))
-		idle_balance(cpu, rq);
-
 	prev->sched_class->put_prev_task(rq, prev);
 	next = pick_next_task(rq, prev);
 
diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index 0ade6f8..2e22529 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -1426,6 +1426,14 @@ static void moved_group_fair(struct task_struct *p)
 }
 #endif
 
+#ifdef CONFIG_SMP
+static void pre_schedule_fair(struct rq *rq, struct task_struct *prev)
+{
+	if (unlikely(!rq->nr_running))
+		idle_balance(rq->cpu, rq);
+}
+#endif
+
 /*
  * All the scheduling class methods:
  */
@@ -1446,6 +1454,7 @@ static const struct sched_class fair_sched_class = {
 #ifdef CONFIG_SMP
 	.load_balance		= load_balance_fair,
 	.move_one_task		= move_one_task_fair,
+	.pre_schedule		= pre_schedule_fair,
 #endif
 
 	.set_curr_task          = set_curr_task_fair,
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html