This is a note to let you know that I've just added the patch titled

    sched/rt: Fix rq's cpupri leak while enqueue/dequeue child RT entities

to the 3.4-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     sched-rt-fix-rq-s-cpupri-leak-while-enqueue-dequeue-child-rt-entities.patch
and it can be found in the queue-3.4 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


From 757dfcaa41844595964f1220f1d33182dae49976 Mon Sep 17 00:00:00 2001
From: Kirill Tkhai <tkhai@xxxxxxxxx>
Date: Wed, 27 Nov 2013 19:59:13 +0400
Subject: sched/rt: Fix rq's cpupri leak while enqueue/dequeue child RT entities

From: Kirill Tkhai <tkhai@xxxxxxxxx>

commit 757dfcaa41844595964f1220f1d33182dae49976 upstream.

This patch touches the RT group scheduling case.

Functions inc_rt_prio_smp() and dec_rt_prio_smp() change the (global) rq's
priority, while the rt_rq passed to them may not be the top-level rt_rq.
This is wrong, because a priority change on a child level does not
guarantee that the priority is the highest over the whole rq. So, this
leak makes RT balancing unusable.

A short example: the task having the highest priority among all of the
rq's RT tasks (no other task has the same priority) is woken on a
throttled rt_rq. The rq's cpupri is set to the equivalent of the task's
priority, but the real rq->rt.highest_prio.curr reflects a lower priority.

The patch below fixes the problem.

Signed-off-by: Kirill Tkhai <tkhai@xxxxxxxxx>
Signed-off-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
CC: Steven Rostedt <rostedt@xxxxxxxxxxx>
Link: http://lkml.kernel.org/r/49231385567953@xxxxxxxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
 kernel/sched/rt.c |   14 ++++++++++++++
 1 file changed, 14 insertions(+)

--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -942,6 +942,13 @@ inc_rt_prio_smp(struct rt_rq *rt_rq, int
 {
 	struct rq *rq = rq_of_rt_rq(rt_rq);
 
+#ifdef CONFIG_RT_GROUP_SCHED
+	/*
+	 * Change rq's cpupri only if rt_rq is the top queue.
+	 */
+	if (&rq->rt != rt_rq)
+		return;
+#endif
 	if (rq->online && prio < prev_prio)
 		cpupri_set(&rq->rd->cpupri, rq->cpu, prio);
 }
@@ -951,6 +958,13 @@ dec_rt_prio_smp(struct rt_rq *rt_rq, int
 {
 	struct rq *rq = rq_of_rt_rq(rt_rq);
 
+#ifdef CONFIG_RT_GROUP_SCHED
+	/*
+	 * Change rq's cpupri only if rt_rq is the top queue.
+	 */
+	if (&rq->rt != rt_rq)
+		return;
+#endif
 	if (rq->online && rt_rq->highest_prio.curr != prev_prio)
 		cpupri_set(&rq->rd->cpupri, rq->cpu, rt_rq->highest_prio.curr);
 }


Patches currently in stable-queue which might be from tkhai@xxxxxxxxx are

queue-3.4/sched-rt-fix-rq-s-cpupri-leak-while-enqueue-dequeue-child-rt-entities.patch
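
For readers who want to see the leak described in the commit message in isolation, here is a minimal user-space sketch. It is not kernel code: `struct rq` and `struct rt_rq` are cut down to the fields that matter, the `cpupri` field stands in for the value the kernel publishes through cpupri_set(), and the names `model_inc_rt_prio_smp()`, `published_after_child_enqueue()`, and the `guarded` flag are hypothetical and exist only for this illustration. The priorities 50 and 10 are arbitrary example values (a lower number means a higher priority), and the sentinel 100 for an empty queue is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel structures (illustration only). */
struct rt_rq {
	int highest_prio_curr;	/* lower value = higher priority */
};

struct rq {
	struct rt_rq rt;	/* top-level RT runqueue */
	bool online;
	int cpupri;		/* models the value published via cpupri_set() */
};

/*
 * Model of inc_rt_prio_smp(). The hypothetical 'guarded' flag switches
 * the check added by the patch on or off.
 */
static void model_inc_rt_prio_smp(struct rq *rq, struct rt_rq *rt_rq,
				  int prio, int prev_prio, bool guarded)
{
	if (guarded && &rq->rt != rt_rq)
		return;		/* child rt_rq: leave the rq-wide value alone */
	if (rq->online && prio < prev_prio)
		rq->cpupri = prio;
}

/*
 * Scenario from the commit message: a task with the highest priority on
 * the rq wakes on a throttled child rt_rq. Because the child is throttled,
 * the top-level highest_prio_curr is not updated in this model.
 * Returns the rq-wide priority published afterwards.
 */
static int published_after_child_enqueue(bool guarded)
{
	const int top_prio = 50;	/* best priority visible at the top level */
	const int task_prio = 10;	/* waking task, higher priority than 50 */
	struct rq rq = {
		.rt = { .highest_prio_curr = top_prio },
		.online = true,
		.cpupri = top_prio,
	};
	struct rt_rq child = { .highest_prio_curr = 100 };	/* empty queue */
	int prev_prio = child.highest_prio_curr;

	assert(task_prio < top_prio);	/* the scenario needs a higher-priority task */

	child.highest_prio_curr = task_prio;
	model_inc_rt_prio_smp(&rq, &child, task_prio, prev_prio, guarded);
	return rq.cpupri;
}
```

Calling `published_after_child_enqueue(false)` yields 10 while the top-level `highest_prio_curr` is still 50, which is the mismatch the commit message describes. With `published_after_child_enqueue(true)` the published value stays at 50 and matches the top-level queue.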