On Tue, Jan 22, 2019 at 02:43:29PM +0000, Patrick Bellasi wrote:
> On 22-Jan 14:56, Peter Zijlstra wrote:
> > On Tue, Jan 15, 2019 at 10:15:04AM +0000, Patrick Bellasi wrote:
> >
> > > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > > index 84294925d006..c8f391d1cdc5 100644
> > > --- a/include/linux/sched.h
> > > +++ b/include/linux/sched.h
> > > @@ -625,6 +625,11 @@ struct uclamp_se {
> > >  	unsigned int bucket_id		: bits_per(UCLAMP_BUCKETS);
> > >  	unsigned int mapped		: 1;
> > >  	unsigned int active		: 1;
> > > +	/* Clamp bucket and value actually used by a RUNNABLE task */
> > > +	struct {
> > > +		unsigned int value	: bits_per(SCHED_CAPACITY_SCALE);
> > > +		unsigned int bucket_id	: bits_per(UCLAMP_BUCKETS);
> > > +	} effective;
> >
> > I am confuzled by this thing.. so uclamp_se already has a value,bucket,
> > which per the prior code is the effective one.
> >
> > Now; I think I see why you want another value; you need the second to
> > store the original value for when the system limits change and we must
> > re-evaluate.
>
> Yes, that's one reason, the other one being to properly support
> CGroup when we add them in the following patches.
>
> Effective will always track the value/bucket in which the task has
> been refcounted at enqueue time and it depends on the aggregated
> value.
> > Should you not update all tasks?
>
> That's true, but that's also an expensive operation, that's why now
> I'm doing only lazy updates at next enqueue time.

Aaah, so you refcount on the original value, which allows you to skip
fixing up all tasks. I missed that bit.

> Do you think that could be acceptable?

Think so, it's a sysctl poke, 'nobody' ever does that.
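
For illustration only, the lazy-update scheme discussed above could be
sketched roughly as below. This is a standalone user-space toy, not the
patch: struct clamp_req, bucket_tasks[], sys_clamp_max, bucket_of(),
clamp_enqueue(), clamp_dequeue() and sysctl_poke() are all made-up names,
and the bucket sizing is an assumption. The point it tries to show is that
the task remembers the value/bucket it was refcounted with at enqueue time,
so a change of the system limit does not have to walk every task; the new
limit is simply picked up at the next enqueue.

```c
#include <assert.h>

#define SCHED_CAPACITY_SCALE	1024U
#define UCLAMP_BUCKETS		5U
/* Assumed bucket width, chosen so that 1024 maps to the last bucket. */
#define UCLAMP_BUCKET_DELTA	(SCHED_CAPACITY_SCALE / UCLAMP_BUCKETS + 1)

/* Hypothetical, simplified stand-in for a per-task clamp request. */
struct clamp_req {
	unsigned int value;		/* value the task asked for */
	struct {
		unsigned int value;	/* value it was refcounted with */
		unsigned int bucket_id;	/* bucket it was refcounted in */
	} effective;
};

/* Hypothetical system-wide limit, standing in for the sysctl knob. */
static unsigned int sys_clamp_max = SCHED_CAPACITY_SCALE;

/* Hypothetical per-runqueue refcounts: RUNNABLE tasks in each bucket. */
static unsigned int bucket_tasks[UCLAMP_BUCKETS];

static unsigned int bucket_of(unsigned int value)
{
	unsigned int id = value / UCLAMP_BUCKET_DELTA;

	return id < UCLAMP_BUCKETS ? id : UCLAMP_BUCKETS - 1;
}

/* Changing the limit touches no task at all. */
static void sysctl_poke(unsigned int new_max)
{
	sys_clamp_max = new_max;
}

/* The effective value/bucket is (re)computed only here. */
static void clamp_enqueue(struct clamp_req *req)
{
	unsigned int v = req->value < sys_clamp_max ? req->value
						    : sys_clamp_max;

	req->effective.value = v;
	req->effective.bucket_id = bucket_of(v);
	bucket_tasks[req->effective.bucket_id]++;
}

/* Release the bucket recorded at enqueue, not one derived from the
 * current limit, so the refcounts stay balanced across a sysctl poke. */
static void clamp_dequeue(struct clamp_req *req)
{
	unsigned int id = req->effective.bucket_id;

	assert(bucket_tasks[id] > 0);
	bucket_tasks[id]--;
}
```

The cost of this approach is that an already enqueued task keeps its stale
effective value until it is dequeued and enqueued again, which is the
trade-off accepted above for a rarely used sysctl.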