On 12/18/2013 02:20 PM, Greg Kroah-Hartman wrote:
> On Thu, Dec 12, 2013 at 10:34:18AM -0600, Chris J Arges wrote:
>> Hi stable maintainers,
>>
>> Please consider the following commit for 3.2-3.12 kernels:
>>
>> commit f9f9ffc237dd924f048204e8799da74f9ecf40cf
>> Author: Ben Segall <bsegall@xxxxxxxxxx>
>> Date:   Wed Oct 16 11:16:32 2013 -0700
>>
>>     sched: Avoid throttle_cfs_rq() racing with period_timer stopping
>>
>>     throttle_cfs_rq() doesn't check to make sure that period_timer is
>>     running, and while update_curr/assign_cfs_runtime does, a concurrently
>>     running period_timer on another cpu could cancel itself between this
>>     cpu's update_curr and throttle_cfs_rq(). If there are no other cfs_rqs
>>     running in the tg to restart the timer, this causes the cfs_rq to be
>>     stranded forever.
>>
>>     Fix this by calling __start_cfs_bandwidth() in throttle if the timer is
>>     inactive.
>>
>>     (Also add some sched_debug lines for cfs_bandwidth.)
>>
>>     Tested: make a run/sleep task in a cgroup, loop switching the cgroup
>>     between 1ms/100ms quota and unlimited, checking for timer_active=0 and
>>     throttled=1 as a failure. With the throttle_cfs_rq() change commented
>>     out this fails, with the full patch it passes.
>>
>>     Signed-off-by: Ben Segall <bsegall@xxxxxxxxxx>
>>     Signed-off-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>>     Cc: pjt@xxxxxxxxxx
>>     Link: http://lkml.kernel.org/r/20131016181632.22647.84174.stgit@xxxxxxxxxxxxxxxxxxxxx.
>>     Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
>>
>> A bug was noticed when running VMs and setting cpu.cfs_quota for that
>> vcpu's cpu cgroup. Occasionally, when rebooting or shutting down a VM,
>> the vcpu task would get stuck in a state where its timer was disabled
>> and it no longer got scheduled on the cfs_rq. This patch fixes
>> throttle_cfs_rq() by checking whether the timer is inactive and, if so,
>> calling __start_cfs_bandwidth().
>>
>> I have only built this against 3.2, 3.5, 3.8 and 3.11. 3.2 and 3.5
>> required trivial backports, while 3.8/3.11 were clean cherry-picks.
>
> For 3.4, this breaks the build, care to provide a "buildable" backport
> for that kernel version?
>
> thanks,
>
> greg k-h

Hi,

Attached is a clean cherry-pick I was able to produce against 3.4.74. I
did 'make defconfig && make' and it built without any issues. I also
built with CONFIG_CFS_BANDWIDTH=y. If this still doesn't apply, let me
know the build error or the config and I'll be sure to fix and test the
patch accordingly.

Thanks,
--chris j arges
From c3fb509ad69871348ae2da9927e5f28791f4a9f9 Mon Sep 17 00:00:00 2001
From: Ben Segall <bsegall@xxxxxxxxxx>
Date: Wed, 16 Oct 2013 11:16:32 -0700
Subject: [PATCH] sched: Avoid throttle_cfs_rq() racing with period_timer
 stopping

throttle_cfs_rq() doesn't check to make sure that period_timer is
running, and while update_curr/assign_cfs_runtime does, a concurrently
running period_timer on another cpu could cancel itself between this
cpu's update_curr and throttle_cfs_rq(). If there are no other cfs_rqs
running in the tg to restart the timer, this causes the cfs_rq to be
stranded forever.

Fix this by calling __start_cfs_bandwidth() in throttle if the timer is
inactive.

(Also add some sched_debug lines for cfs_bandwidth.)

Tested: make a run/sleep task in a cgroup, loop switching the cgroup
between 1ms/100ms quota and unlimited, checking for timer_active=0 and
throttled=1 as a failure. With the throttle_cfs_rq() change commented
out this fails, with the full patch it passes.

Signed-off-by: Ben Segall <bsegall@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: pjt@xxxxxxxxxx
Link: http://lkml.kernel.org/r/20131016181632.22647.84174.stgit@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
(cherry picked from commit f9f9ffc237dd924f048204e8799da74f9ecf40cf)
Signed-off-by: Chris J Arges <chris.j.arges@xxxxxxxxxxxxx>
---
 kernel/sched/debug.c | 8 ++++++++
 kernel/sched/fair.c  | 2 ++
 2 files changed, 10 insertions(+)

diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 09acaa1..2f9ac8f 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -215,6 +215,14 @@ void print_cfs_rq(struct seq_file *m, int cpu, struct cfs_rq *cfs_rq)
 	SEQ_printf(m, "  .%-30s: %d\n", "load_tg",
 			atomic_read(&cfs_rq->tg->load_weight));
 #endif
+#ifdef CONFIG_CFS_BANDWIDTH
+	SEQ_printf(m, "  .%-30s: %d\n", "tg->cfs_bandwidth.timer_active",
+			cfs_rq->tg->cfs_bandwidth.timer_active);
+	SEQ_printf(m, "  .%-30s: %d\n", "throttled",
+			cfs_rq->throttled);
+	SEQ_printf(m, "  .%-30s: %d\n", "throttle_count",
+			cfs_rq->throttle_count);
+#endif

 	print_cfs_group_stats(m, cpu, cfs_rq->tg);
 #endif
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 33c8b60..efe9253 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1655,6 +1655,8 @@ static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
 	cfs_rq->throttled_timestamp = rq->clock;
 	raw_spin_lock(&cfs_b->lock);
 	list_add_tail_rcu(&cfs_rq->throttled_list, &cfs_b->throttled_cfs_rq);
+	if (!cfs_b->timer_active)
+		__start_cfs_bandwidth(cfs_b);
 	raw_spin_unlock(&cfs_b->lock);
 }

-- 
1.7.9.5