Re: [PATCH-3.2.y-3.12.y] sched: Avoid throttle_cfs_rq() racing with period_timer

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/18/2013 02:20 PM, Greg Kroah-Hartman wrote:
> On Thu, Dec 12, 2013 at 10:34:18AM -0600, Chris J Arges wrote:
>> Hi stable maintainers,
>>
>> Please consider the following commit for 3.2-3.12 kernels:
>>
>>  commit f9f9ffc237dd924f048204e8799da74f9ecf40cf
>>  Author: Ben Segall <bsegall@xxxxxxxxxx>
>>  Date:   Wed Oct 16 11:16:32 2013 -0700
>>
>>      sched: Avoid throttle_cfs_rq() racing with period_timer stopping
>>
>>      throttle_cfs_rq() doesn't check to make sure that period_timer is
>> running,
>>      and while update_curr/assign_cfs_runtime does, a concurrently running
>>      period_timer on another cpu could cancel itself between this cpu's
>>      update_curr and throttle_cfs_rq(). If there are no other cfs_rqs
>> running
>>      in the tg to restart the timer, this causes the cfs_rq to be stranded
>>      forever.
>>
>>      Fix this by calling __start_cfs_bandwidth() in throttle if the timer is
>>      inactive.
>>
>>      (Also add some sched_debug lines for cfs_bandwidth.)
>>
>>      Tested: make a run/sleep task in a cgroup, loop switching the cgroup
>>      between 1ms/100ms quota and unlimited, checking for timer_active=0 and
>>      throttled=1 as a failure. With the throttle_cfs_rq() change
>> commented out
>>      this fails, with the full patch it passes.
>>
>>      Signed-off-by: Ben Segall <bsegall@xxxxxxxxxx>
>>      Signed-off-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>>      Cc: pjt@xxxxxxxxxx
>>      Link:
>> http://lkml.kernel.org/r/20131016181632.22647.84174.stgit@xxxxxxxxxxxxxxxxxxxxx.
>>      Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
>>
>>
>> A bug was noticed when running VMs and setting cpu.cfs_quota for that
>> vcpu's cpu cgroup. Occasionally when rebooting or shutting down a VM the
>> vcpu task would get stuck in a state where the task has a timer disabled
>> and no longer gets scheduled on the cfs_rq. This patch fixes
>> throttle_cfs_rq by checking if the timer is not active, and then calling
>> __start_cfs_bandwidth.
>>
>> I have only built this against 3.2, 3.5, 3.8 and 3.11. 3.2 and 3.5
>> required trivial backports. While, 3.8/3.11 were clean cherry-picks.
> 
> For 3.4, this breaks the build, care to provide a "buildable" backport
> for that kernel version?
> 
> thanks,
> 
> greg k-h
> 

Hi,
Attached is a clean cherry-pick I was able to produce against 3.4.74.
I did 'make defconfig && make' and it built without any issues.
I also built with CONFIG_CFS_BANDWIDTH=y.
If this also doesn't apply, let me know what the build error is, or
config and I'll be sure to fix and test the patch accordingly.
Thanks,
--chris j arges


>From c3fb509ad69871348ae2da9927e5f28791f4a9f9 Mon Sep 17 00:00:00 2001
From: Ben Segall <bsegall@xxxxxxxxxx>
Date: Wed, 16 Oct 2013 11:16:32 -0700
Subject: [PATCH] sched: Avoid throttle_cfs_rq() racing with period_timer
 stopping

throttle_cfs_rq() doesn't check to make sure that period_timer is running,
and while update_curr/assign_cfs_runtime does, a concurrently running
period_timer on another cpu could cancel itself between this cpu's
update_curr and throttle_cfs_rq(). If there are no other cfs_rqs running
in the tg to restart the timer, this causes the cfs_rq to be stranded
forever.

Fix this by calling __start_cfs_bandwidth() in throttle if the timer is
inactive.

(Also add some sched_debug lines for cfs_bandwidth.)

Tested: make a run/sleep task in a cgroup, loop switching the cgroup
between 1ms/100ms quota and unlimited, checking for timer_active=0 and
throttled=1 as a failure. With the throttle_cfs_rq() change commented out
this fails, with the full patch it passes.

Signed-off-by: Ben Segall <bsegall@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: pjt@xxxxxxxxxx
Link: http://lkml.kernel.org/r/20131016181632.22647.84174.stgit@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
(cherry picked from commit f9f9ffc237dd924f048204e8799da74f9ecf40cf)

Signed-off-by: Chris J Arges <chris.j.arges@xxxxxxxxxxxxx>
---
 kernel/sched/debug.c |    8 ++++++++
 kernel/sched/fair.c  |    2 ++
 2 files changed, 10 insertions(+)

diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 09acaa1..2f9ac8f 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -215,6 +215,14 @@ void print_cfs_rq(struct seq_file *m, int cpu, struct cfs_rq *cfs_rq)
 	SEQ_printf(m, "  .%-30s: %d\n", "load_tg",
 			atomic_read(&cfs_rq->tg->load_weight));
 #endif
+#ifdef CONFIG_CFS_BANDWIDTH
+	SEQ_printf(m, "  .%-30s: %d\n", "tg->cfs_bandwidth.timer_active",
+			cfs_rq->tg->cfs_bandwidth.timer_active);
+	SEQ_printf(m, "  .%-30s: %d\n", "throttled",
+			cfs_rq->throttled);
+	SEQ_printf(m, "  .%-30s: %d\n", "throttle_count",
+			cfs_rq->throttle_count);
+#endif
 
 	print_cfs_group_stats(m, cpu, cfs_rq->tg);
 #endif
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 33c8b60..efe9253 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1655,6 +1655,8 @@ static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
 	cfs_rq->throttled_timestamp = rq->clock;
 	raw_spin_lock(&cfs_b->lock);
 	list_add_tail_rcu(&cfs_rq->throttled_list, &cfs_b->throttled_cfs_rq);
+	if (!cfs_b->timer_active)
+		__start_cfs_bandwidth(cfs_b);
 	raw_spin_unlock(&cfs_b->lock);
 }
 
-- 
1.7.9.5


[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]