The patch titled

     sched-avoid-unnecessarily-moving-highest-priority-task-move_tasks fix

has been added to the -mm tree.  Its filename is

     sched-avoid-unnecessarily-moving-highest-priority-task-move_tasks-fix.patch

See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
out what to do about this


From: "Siddha, Suresh B" <suresh.b.siddha@xxxxxxxxx>

On Sat, Apr 22, 2006 at 11:31:29AM +1000, Peter Williams wrote:

> If there are more than one task with the highest priority then it is
> desirable to move one of them by overriding the skip mechanism as it can
> be considered the second highest priority task.

I think your patch is not doing what you intend to do.

> This initialization
> just checks to see if the currently running task is one of the highest
> priority tasks.  If it is then it's OK to move the first task we find
> that also has the same priority otherwise we wait until we've skipped
> one before we move one.

If the currently running task is of the highest priority, we set
busiest_best_prio_seen to '1' and, looking at the code, because of this we
never consider any other busiest best prio task which we come across on the
expired list.  This comes from this piece of code:

	skip_for_load = busiest_best_prio_seen || idx != busiest_best_prio;

skip_for_load is always set to '1' (because of busiest_best_prio_seen) and we
will never be able to move any busiest best prio task to the new queue.

> > This patch doesn't address the issue where we can skip the highest priority
> > task movement if there is only one such task on the busy runqueue
> > (and is on the expired list..)
>
> I think that it does.

No.  It doesn't.  In this case busiest_best_prio_seen will be set to '0'.
When we traverse the only highest priority task on this queue (which happens
to be on the expired list), we set skip_for_load to '0' and we will try
pulling the only highest priority task on this queue to the new queue.
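The two cases above can be checked outside the kernel.  What follows is only
a minimal standalone sketch, not kernel code: struct candidate, its field
names, the two helper functions and the priority values 100 and 120 are all
hypothetical stand-ins for the locals in move_tasks().  The two expressions
are transcribed from the lines this patch removes and adds.  It models only
skip_for_load; in the real function can_migrate_task() still has the final
say.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the locals that move_tasks() consults. */
struct candidate {
	bool too_heavy;		/* tmp->load_weight > rem_load_move */
	int idx;		/* prio index of the task being examined */
	int this_best_prio;	/* best prio on the destination runqueue */
	int busiest_best_prio;	/* best prio on the busiest runqueue */
	bool best_prio_seen;	/* busiest_best_prio_seen */
	bool only_task_at_idx;	/* head->next == head->prev */
};

/* Expression removed by the patch. */
static bool skip_old(const struct candidate *c)
{
	bool skip = c->too_heavy;

	if (skip && c->idx < c->this_best_prio)
		skip = c->best_prio_seen || c->idx != c->busiest_best_prio;
	return skip;
}

/* Expression added by the patch. */
static bool skip_new(const struct candidate *c)
{
	bool skip = c->too_heavy;

	if (skip && c->idx < c->this_best_prio &&
	    c->idx == c->busiest_best_prio)
		skip = !c->best_prio_seen && c->only_task_at_idx;
	return skip;
}

/* Case 1: the running task has the best prio; another one is expired. */
static const struct candidate second_best_prio_task = {
	.too_heavy = true, .idx = 100, .this_best_prio = 120,
	.busiest_best_prio = 100, .best_prio_seen = true,
	.only_task_at_idx = true,
};

/* Case 2: the only best prio task on the busiest queue is expired. */
static const struct candidate sole_best_prio_task = {
	.too_heavy = true, .idx = 100, .this_best_prio = 120,
	.busiest_best_prio = 100, .best_prio_seen = false,
	.only_task_at_idx = true,
};

/* Aborts via assert() if the sketch disagrees with the description. */
static void check_cases(void)
{
	assert(skip_old(&second_best_prio_task));	/* old: never moved */
	assert(!skip_new(&second_best_prio_task));	/* new: may move */
	assert(!skip_old(&sole_best_prio_task));	/* old: gets pulled */
	assert(skip_new(&sole_best_prio_task));		/* new: stays put */
}
```

Calling check_cases() from a main() exercises both cases.  With the removed
expression, case 1 is always skipped and case 2 is always considered for the
move, which are the two problems described above; the added expression
reverses both outcomes.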
Appended patch (on top of
sched-avoid-unnecessarily-moving-highest-priority-task-move_tasks.patch)
fixes these issues.

Signed-off-by: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>
Cc: Peter Williams <pwil3058@xxxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxx>
---

 kernel/sched.c |   22 ++++++++++++----------
 1 files changed, 12 insertions(+), 10 deletions(-)

diff -puN kernel/sched.c~sched-avoid-unnecessarily-moving-highest-priority-task-move_tasks-fix kernel/sched.c
--- 25/kernel/sched.c~sched-avoid-unnecessarily-moving-highest-priority-task-move_tasks-fix	Mon Apr 24 15:58:28 2006
+++ 25-akpm/kernel/sched.c	Mon Apr 24 15:58:28 2006
@@ -1996,7 +1996,7 @@ static int move_tasks(runqueue_t *this_r
 	prio_array_t *array, *dst_array;
 	struct list_head *head, *curr;
 	int idx, pulled = 0, pinned = 0, this_best_prio, busiest_best_prio;
-	int busiest_best_prio_seen;
+	int busiest_best_prio_seen = 0;
 	int skip_for_load; /* skip the task based on weighted load issues */
 	long rem_load_move;
 	task_t *tmp;
@@ -2008,11 +2008,6 @@ static int move_tasks(runqueue_t *this_r
 	pinned = 1;
 	this_best_prio = rq_best_prio(this_rq);
 	busiest_best_prio = rq_best_prio(busiest);
-	/*
-	 * Enable handling of the case where there is more than one task
-	 * with the best priority.
-	 */
-	busiest_best_prio_seen = busiest_best_prio == busiest->curr->prio;
 
 	/*
 	 * We first consider expired tasks. Those will likely not be
@@ -2023,6 +2018,13 @@ static int move_tasks(runqueue_t *this_r
 	if (busiest->expired->nr_active) {
 		array = busiest->expired;
 		dst_array = this_rq->expired;
+		/*
+		 * We already have one or more busiest best prio tasks on
+		 * active list. So if we encounter any busiest best prio task
+		 * on expired list, consider it for the move, if it becomes
+		 * the best prio on new queue.
+		 */
+		busiest_best_prio_seen = busiest_best_prio == busiest->curr->prio;
 	} else {
 		array = busiest->active;
 		dst_array = this_rq->active;
@@ -2040,6 +2042,7 @@ skip_bitmap:
 		if (array == busiest->expired && busiest->active->nr_active) {
 			array = busiest->active;
 			dst_array = this_rq->active;
+			busiest_best_prio_seen = 0;
 			goto new_array;
 		}
 		goto out;
@@ -2058,8 +2061,9 @@ skip_queue:
 	 * prio value) on its new queue regardless of its load weight
 	 */
 	skip_for_load = tmp->load_weight > rem_load_move;
-	if (skip_for_load && idx < this_best_prio)
-		skip_for_load = busiest_best_prio_seen || idx != busiest_best_prio;
+	if (skip_for_load && idx < this_best_prio && idx == busiest_best_prio)
+		skip_for_load = !busiest_best_prio_seen &&
+				head->next == head->prev;
 	if (skip_for_load ||
 	    !can_migrate_task(tmp, busiest, this_cpu, sd, idle, &pinned)) {
 		if (curr != head)
@@ -2084,8 +2088,6 @@ skip_queue:
 	if (pulled < max_nr_move && rem_load_move > 0) {
 		if (idx < this_best_prio)
 			this_best_prio = idx;
-		if (idx == busiest_best_prio)
-			busiest_best_prio_seen = 1;
 		if (curr != head)
 			goto skip_queue;
 		idx++;
_

Patches currently in -mm which might be from suresh.b.siddha@xxxxxxxxx are

sched-implement-smpnice.patch
sched-prevent-high-load-weight-tasks-suppressing-balancing.patch
sched-improve-stability-of-smpnice-load-balancing.patch
sched-improve-smpnice-load-balancing-when-load-per-task.patch
smpnice-dont-consider-sched-groups-which-are-lightly-loaded-for-balancing.patch
smpnice-dont-consider-sched-groups-which-are-lightly-loaded-for-balancing-fix.patch
sched-modify-move_tasks-to-improve-load-balancing-outcomes.patch
sched-avoid-unnecessarily-moving-highest-priority-task-move_tasks.patch
sched-avoid-unnecessarily-moving-highest-priority-task-move_tasks-fix.patch
sched_domain-handle-kmalloc-failure.patch
sched_domain-handle-kmalloc-failure-fix.patch
sched_domain-dont-use-gfp_atomic.patch
sched_domain-use-kmalloc_node.patch
sched_domain-allocate-sched_group-structures-dynamically.patch