Re: [PATCH v6 04/16] sched/core: uclamp: Add CPU's clamp buckets refcounting

Peter Zijlstra <peterz@xxxxxxxxxxxxx> · Tue, 22 Jan 2019 10:45:07 +0100

On Mon, Jan 21, 2019 at 04:33:38PM +0000, Patrick Bellasi wrote:
> On 21-Jan 17:12, Peter Zijlstra wrote:
> > On Mon, Jan 21, 2019 at 03:23:11PM +0000, Patrick Bellasi wrote:

> > > and keep all
> > > the buckets in use at the beginning of a cache line.
> > 
> > That; is that the rationale for all this? Note that per the defaults
> > everything is in a single line already.
> 
> Yes, that's because of the loop in:
> 
>    dequeue_task()
>      uclamp_cpu_dec()
>        uclamp_cpu_dec_id()
>          uclamp_cpu_update()
> 
> where buckets sometimes need to be scanned to find a new max.
> 
> Consider also that, with mapping, we can more easily increase the
> buckets count to 20 in order to have a finer clamping granularity if
> needed, without worrying too much about the performance impact,
> especially when we use only a few different clamp values anyway.
> 
> So, I agree that mapping adds (code) complexity but it can also save
> a few cycles in the fast path... do you think it's not worth the added
> complexity?

Then maybe split this out in a separate patch? Do the trivial linear
bucket thing first and then do this smarty pants thing on top.
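
A minimal sketch of what the "trivial linear bucket thing" could look
like, assuming a clamp value indexes its bucket directly by division and
that the rq-wide clamp is the floor value of the highest active bucket.
All names and the bucket count here are hypothetical, chosen for
illustration; this is not the code from the series.

```c
#include <assert.h>

#define CAPACITY_SCALE	1024	/* assumed full-scale clamp value */
#define NR_BUCKETS	5	/* assumed bucket count for this sketch */
#define BUCKET_DELTA	(CAPACITY_SCALE / NR_BUCKETS)

/* Hypothetical per-rq state: one task refcount per bucket. */
struct clamp_rq {
	unsigned int value;		/* current max clamp on this rq */
	unsigned int tasks[NR_BUCKETS];	/* RUNNABLE tasks per bucket */
};

/* Linear mapping: no lookup table, the bucket index is just a division. */
static unsigned int bucket_id(unsigned int clamp_value)
{
	unsigned int id = clamp_value / BUCKET_DELTA;

	return id < NR_BUCKETS ? id : NR_BUCKETS - 1;
}

/* Slow path: scan from the top bucket down for the first active one. */
static unsigned int clamp_rq_scan_max(const struct clamp_rq *rq)
{
	int id;

	for (id = NR_BUCKETS - 1; id >= 0; id--) {
		if (rq->tasks[id])
			return id * BUCKET_DELTA;
	}
	return 0;
}

/* Enqueue side: refcount the bucket, raise the rq clamp if needed. */
static void clamp_rq_inc(struct clamp_rq *rq, unsigned int clamp_value)
{
	unsigned int id = bucket_id(clamp_value);

	rq->tasks[id]++;
	if (id * BUCKET_DELTA > rq->value)
		rq->value = id * BUCKET_DELTA;
}

/* Dequeue side: only rescan when the bucket holding the max goes idle. */
static void clamp_rq_dec(struct clamp_rq *rq, unsigned int clamp_value)
{
	unsigned int id = bucket_id(clamp_value);

	assert(rq->tasks[id] > 0);
	if (--rq->tasks[id])
		return;
	if (id * BUCKET_DELTA < rq->value)
		return;
	rq->value = clamp_rq_scan_max(rq);
}
```

With the assumed 5 buckets the whole tasks[] array is 20 bytes, so the
scan stays within one cache line without any mapping layer.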

One problem with the scheme is that it doesn't defrag; so if you get a
peak usage, you can still end up with only two active buckets in
different lines.
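
A toy model of that fragmentation, assuming a first-free slot allocator
with per-slot refcounts, 8-byte slots and 64-byte cache lines. The
structure and function names are hypothetical and only illustrate the
concern; they are not taken from the patch.

```c
#include <stddef.h>

#define NR_SLOTS	20	/* assumed number of mapped buckets */
#define CACHE_LINE	64	/* assumed cache line size in bytes */

/* Hypothetical mapped bucket: a clamp value plus its refcount. */
struct clamp_slot {
	unsigned int value;
	unsigned int refs;
};

struct clamp_map {
	struct clamp_slot slot[NR_SLOTS];
};

/* Reuse the slot already holding @value, else take the first free one. */
static int clamp_map_get(struct clamp_map *m, unsigned int value)
{
	int free_id = -1;
	int i;

	for (i = 0; i < NR_SLOTS; i++) {
		if (m->slot[i].refs && m->slot[i].value == value) {
			m->slot[i].refs++;
			return i;
		}
		if (!m->slot[i].refs && free_id < 0)
			free_id = i;
	}
	if (free_id < 0)
		return -1;	/* map is full */

	m->slot[free_id].value = value;
	m->slot[free_id].refs = 1;
	return free_id;
}

/* Drop a reference; surviving slots are never moved (no defrag). */
static void clamp_map_put(struct clamp_map *m, int id)
{
	if (m->slot[id].refs)
		m->slot[id].refs--;
}

/* Which cache line of the array a slot falls in. */
static size_t clamp_slot_line(int id)
{
	return ((size_t)id * sizeof(struct clamp_slot)) / CACHE_LINE;
}
```

Allocate ten distinct clamp values, then release all but the first and
the last: the two survivors keep slots 0 and 9, and with the assumed
sizes those sit in different cache lines even though only two buckets
are active.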

Also; if it is its own patch, you get a much better view of the
additional complexity and a chance to justify it ;-)

Also; would it make sense to do s/cpu/rq/ on much of this? All this
uclamp_cpu_*() stuff really is per rq and takes rq arguments, so why
does it have cpu in the name... no strong feelings, just noticed it and
thought it a tad inconsistent.