Konstantin Khlebnikov <khlebnikov@xxxxxxxxxxxxxx> writes:

> Cgroup created inside throttled group must inherit current throttle_count.
> Broken throttle_count allows to nominate throttled entries as a next buddy,
> later this leads to null pointer dereference in pick_next_task_fair().
>
> This patch initialize cfs_rq->throttle_count at first enqueue: laziness
> allows to skip locking all rq at group creation. Lazy approach also allows
> to skip full sub-tree scan at throttling hierarchy (not in this patch).
>
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@xxxxxxxxxxxxxx>
> Cc: Stable <stable@xxxxxxxxxxxxxxx> # v3.2+
> ---
>  kernel/sched/fair.c  | 19 +++++++++++++++++++
>  kernel/sched/sched.h |  2 +-
>  2 files changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 218f8e83db73..fe809fe169d2 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4185,6 +4185,25 @@ static void check_enqueue_throttle(struct cfs_rq *cfs_rq)
>  	if (!cfs_bandwidth_used())
>  		return;
>
> +	/* synchronize hierarchical throttle counter */
> +	if (unlikely(!cfs_rq->throttle_uptodate)) {
> +		struct rq *rq = rq_of(cfs_rq);
> +		struct cfs_rq *pcfs_rq;
> +		struct task_group *tg;
> +
> +		cfs_rq->throttle_uptodate = 1;
> +		/* get closest uptodate node because leaves goes first */
> +		for (tg = cfs_rq->tg->parent; tg; tg = tg->parent) {
> +			pcfs_rq = tg->cfs_rq[cpu_of(rq)];
> +			if (pcfs_rq->throttle_uptodate)
> +				break;
> +		}
> +		if (tg) {
> +			cfs_rq->throttle_count = pcfs_rq->throttle_count;
> +			cfs_rq->throttled_clock_task = rq_clock_task(rq);
> +		}
> +	}
> +

Checking just in enqueue is not sufficient: throttled_lb_pair() can check
against a cfs_rq that has never been enqueued (and possibly other paths
can too). It might also make sense to go ahead and initialize all the
cfs_rqs we skipped over on the walk up, to avoid some n^2 pathological
behavior. You could also use throttle_count == -1 (or < 0) as the
"not yet initialized" marker instead of a separate throttle_uptodate field.
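The two suggestions above (a negative throttle_count as the "not yet
synchronized" sentinel, and initializing every node skipped during the
upward walk) can be sketched as a self-contained toy model in plain C.
The names here (struct node, sync_throttle_count, UNSYNCED) are
illustrative only, not the kernel's actual cfs_rq code:

```c
#include <stddef.h>

/* Sentinel meaning "inherit from parent on first use"; replaces a
 * separate throttle_uptodate flag. Hypothetical toy model, not kernel code. */
#define UNSYNCED (-1)

struct node {
	struct node *parent;
	int throttle_count;	/* UNSYNCED until first synchronized */
};

/*
 * Recurse up to the closest already-initialized ancestor, then fill in
 * every node on the way back down. Because the skipped nodes are
 * initialized too, each node is walked at most once overall, avoiding
 * the n^2 pathological case of repeatedly re-walking the same chain.
 */
static void sync_throttle_count(struct node *n)
{
	if (!n || n->throttle_count != UNSYNCED)
		return;
	sync_throttle_count(n->parent);		/* ancestors first */
	n->throttle_count = n->parent ? n->parent->throttle_count : 0;
}
```

With a throttled root (throttle_count == 2) and two lazily created
descendants, syncing only the leaf also initializes the intermediate
node, so a later check against it sees the inherited count.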
(We had our own version of this patch that I guess we forgot to push?)

>  	/* an active group must be handled by the update_curr()->put() path */
>  	if (!cfs_rq->runtime_enabled || cfs_rq->curr)
>  		return;
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 72f1f3087b04..7cbeb92a1cb9 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -437,7 +437,7 @@ struct cfs_rq {
>
>  	u64 throttled_clock, throttled_clock_task;
>  	u64 throttled_clock_task_time;
> -	int throttled, throttle_count;
> +	int throttled, throttle_count, throttle_uptodate;
>  	struct list_head throttled_list;
>  #endif /* CONFIG_CFS_BANDWIDTH */
>  #endif /* CONFIG_FAIR_GROUP_SCHED */