This is a note to let you know that I've just added the patch titled sched/rt: Fix bad task migration for rt tasks to the 5.15-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: sched-rt-fix-bad-task-migration-for-rt-tasks.patch and it can be found in the queue-5.15 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. commit 27d44e2e15e08ab48ac9291e4257cdde06971c35 Author: Schspa Shi <schspa@xxxxxxxxx> Date: Mon Aug 29 01:03:02 2022 +0800 sched/rt: Fix bad task migration for rt tasks [ Upstream commit feffe5bb274dd3442080ef0e4053746091878799 ] Commit 95158a89dd50 ("sched,rt: Use the full cpumask for balancing") allows find_lock_lowest_rq() to pick a task with migration disabled. The purpose of the commit is to push the current running task on the CPU that has the migrate_disable() task away. However, there is a race which allows a migrate_disable() task to be migrated. Consider: CPU0 CPU1 push_rt_task check is_migration_disabled(next_task) task not running and migration_disabled == 0 find_lock_lowest_rq(next_task, rq); _double_lock_balance(this_rq, busiest); raw_spin_rq_unlock(this_rq); double_rq_lock(this_rq, busiest); <<wait for busiest rq>> <wakeup> task become running migrate_disable(); <context out> deactivate_task(rq, next_task, 0); set_task_cpu(next_task, lowest_rq->cpu); WARN_ON_ONCE(is_migration_disabled(p)); Fixes: 95158a89dd50 ("sched,rt: Use the full cpumask for balancing") Signed-off-by: Schspa Shi <schspa@xxxxxxxxx> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx> Reviewed-by: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@xxxxxxx> Reviewed-by: Valentin Schneider <vschneid@xxxxxxxxxx> Tested-by: Dwaine Gonyier <dgonyier@xxxxxxxxxx> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 3aad381f42ed4..b3e2064983952 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -2084,6 +2084,7 @@ static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq) !cpumask_test_cpu(later_rq->cpu, &task->cpus_mask) || task_running(rq, task) || !dl_task(task) || + is_migration_disabled(task) || !task_on_rq_queued(task))) { double_unlock_balance(rq, later_rq); later_rq = NULL; diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index 8d170b5bdae92..4b9281e6b1ccd 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -1842,11 +1842,15 @@ static struct rq *find_lock_lowest_rq(struct task_struct *task, struct rq *rq) * the mean time, task could have * migrated already or had its affinity changed. * Also make sure that it wasn't scheduled on its rq. + * It is possible the task was scheduled, set + * "migrate_disabled" and then got preempted, so we must + * check the task migration disable flag here too. */ if (unlikely(task_rq(task) != rq || !cpumask_test_cpu(lowest_rq->cpu, &task->cpus_mask) || task_running(rq, task) || !rt_task(task) || + is_migration_disabled(task) || !task_on_rq_queued(task))) { double_unlock_balance(rq, lowest_rq);