Re: [PATCH-3.2.y-3.12.y] sched: Avoid throttle_cfs_rq() racing with period_timer

On Thu, Dec 12, 2013 at 10:34:18AM -0600, Chris J Arges wrote:
> Hi stable maintainers,
> 
> Please consider the following commit for 3.2-3.12 kernels:
> 
>  commit f9f9ffc237dd924f048204e8799da74f9ecf40cf
>  Author: Ben Segall <bsegall@xxxxxxxxxx>
>  Date:   Wed Oct 16 11:16:32 2013 -0700
> 
>      sched: Avoid throttle_cfs_rq() racing with period_timer stopping
> 
>      throttle_cfs_rq() doesn't check to make sure that period_timer is running,
>      and while update_curr/assign_cfs_runtime does, a concurrently running
>      period_timer on another cpu could cancel itself between this cpu's
>      update_curr and throttle_cfs_rq(). If there are no other cfs_rqs running
>      in the tg to restart the timer, this causes the cfs_rq to be stranded
>      forever.
> 
>      Fix this by calling __start_cfs_bandwidth() in throttle if the timer is
>      inactive.
> 
>      (Also add some sched_debug lines for cfs_bandwidth.)
> 
>      Tested: make a run/sleep task in a cgroup, loop switching the cgroup
>      between 1ms/100ms quota and unlimited, checking for timer_active=0 and
>      throttled=1 as a failure. With the throttle_cfs_rq() change commented out
>      this fails, with the full patch it passes.
> 
>      Signed-off-by: Ben Segall <bsegall@xxxxxxxxxx>
>      Signed-off-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>      Cc: pjt@xxxxxxxxxx
>      Link: http://lkml.kernel.org/r/20131016181632.22647.84174.stgit@xxxxxxxxxxxxxxxxxxxxx.
>      Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
> 
> 
> A bug was noticed when running VMs and setting cpu.cfs_quota for a
> vcpu's cpu cgroup. Occasionally, when rebooting or shutting down a VM, the
> vcpu task would get stuck in a state where its timer is disabled
> and it no longer gets scheduled on the cfs_rq. This patch fixes
> throttle_cfs_rq by checking whether the timer is inactive and, if so,
> calling __start_cfs_bandwidth.
> 
> I have only built this against 3.2, 3.5, 3.8 and 3.11. 3.2 and 3.5
> required trivial backports, while 3.8 and 3.11 were clean cherry-picks.
> 
> BugLink: http://bugs.launchpad.net/bugs/1259645
> 
> Thanks,
> --chris j arges
> --
> To unsubscribe from this list: send the line "unsubscribe stable" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Thanks Chris, I'm queuing this for the 3.5 and 3.11 kernels.
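
For anyone reading this thread without the patch at hand, the race and fix
described in the quoted commit message can be sketched in user space as below.
This is a toy model under stated assumptions, not the kernel code: every toy_*
name is hypothetical, locking is omitted, and the two CPUs are replayed as one
fixed interleaving instead of running concurrently.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct toy_bandwidth {
    bool timer_active;  /* is the period timer armed? */
    int  restarts;      /* how often the timer was restarted */
};

struct toy_cfs_rq {
    struct toy_bandwidth *b;
    bool throttled;
};

/* Stand-in for __start_cfs_bandwidth(): arm the period timer. */
static void toy_start_bandwidth(struct toy_bandwidth *b)
{
    b->timer_active = true;
    b->restarts++;
}

/* Stand-in for the period timer cancelling itself on another cpu. */
static void toy_period_timer_stop(struct toy_bandwidth *b)
{
    b->timer_active = false;
}

/* Throttle path; with_fix selects whether the timer check is applied. */
static void toy_throttle(struct toy_cfs_rq *rq, bool with_fix)
{
    rq->throttled = true;
    if (with_fix && !rq->b->timer_active)
        toy_start_bandwidth(rq->b);
}

/* A throttled rq whose timer is inactive has nothing left to unthrottle it. */
static bool toy_is_stranded(const struct toy_cfs_rq *rq)
{
    return rq->throttled && !rq->b->timer_active;
}

/* Replay the interleaving: the timer stops between the runtime check
 * (which saw it active) and the throttle on this cpu. */
static bool toy_race_strands(bool with_fix)
{
    struct toy_bandwidth b = { .timer_active = true, .restarts = 0 };
    struct toy_cfs_rq rq = { .b = &b, .throttled = false };

    toy_period_timer_stop(&b);    /* other cpu: timer cancels itself */
    toy_throttle(&rq, with_fix);  /* this cpu: throttle afterwards   */
    return toy_is_stranded(&rq);
}
```

In this model, toy_race_strands(false) returns true (throttled with no timer
to recover it) and toy_race_strands(true) returns false, which mirrors the
timer_active=0 / throttled=1 failure condition used in the commit's test.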

Cheers,
--
Luis