On Wed, Jan 16, 2019 at 07:52:08PM +0000, Alakesh Haloi wrote: > [ Upstream commit 512ac999d2755d2b7109e996a76b6fb8b888631d ] > > I noticed that cgroup task groups constantly get throttled even > if they have low CPU usage, this causes some jitters on the response > time to some of our business containers when enabling CPU quotas. > > It's very simple to reproduce: > > mkdir /sys/fs/cgroup/cpu/test > cd /sys/fs/cgroup/cpu/test > echo 100000 > cpu.cfs_quota_us > echo $$ > tasks > > then repeat: > > cat cpu.stat | grep nr_throttled # nr_throttled will increase steadily > > After some analysis, we found that cfs_rq::runtime_remaining will > be cleared by expire_cfs_rq_runtime() due to two equal but stale > "cfs_{b|q}->runtime_expires" after period timer is re-armed. > > The current condition to judge clock drift in expire_cfs_rq_runtime() > is wrong, the two runtime_expires are actually the same when clock > drift happens, so this condtion can never hit. The orginal design was > correctly done by this commit: > > a9cf55b28610 ("sched: Expire invalid runtime") > > ... but was changed to be the current implementation due to its locking bug. > > This patch introduces another way, it adds a new field in both structures > cfs_rq and cfs_bandwidth to record the expiration update sequence, and > uses them to figure out if clock drift happens (true if they are equal). > > Signed-off-by: Xunlei Pang <xlpang@xxxxxxxxxxxxxxxxx> > Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx> > [alakeshh: backport: Fixed merge conflicts: > - sched.h: Fix the indentation and order in which the variables are > declared to match with coding style of the existing code in 4.14 > Struct members of same type were declared in separate lines in > upstream patch which has been changed back to having multiple > members of same type in the same line. > e.g. int a; int b; -> int a, b; ] Now queued up, thanks! greg k-h