On Wed, 3 Jun 2020 at 18:52, Qais Yousef <qais.yousef@xxxxxxx> wrote:
>
> On 06/03/20 16:59, Vincent Guittot wrote:
> > When I want to stress the fast path I usually use "perf bench sched pipe -T".
> > The tip/sched/core on my arm octo core gives the following results for
> > 20 iterations of perf bench sched pipe -T -l 50000:
> >
> > all uclamp config disabled   50035.4 (+/- 0.334%)
> > all uclamp config enabled    48749.8 (+/- 0.339%)   -2.64%
> >
> > It's quite easy to reproduce and probably easier to study the impact.
>
> Thanks Vincent. This is very useful!
>
> I could reproduce that on my Juno.
>
> One of the codepaths I was suspecting seems to affect it:
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 0464569f26a7..9f48090eb926 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1063,10 +1063,12 @@ static inline void uclamp_rq_dec_id(struct rq *rq, struct task_struct *p,
>          * e.g. due to future modification, warn and fixup the expected value.
>          */
>         SCHED_WARN_ON(bucket->value > rq_clamp);
> +#if 0
>         if (bucket->value >= rq_clamp) {
>                 bkt_clamp = uclamp_rq_max_value(rq, clamp_id, uc_se->value);
>                 WRITE_ONCE(uc_rq->value, bkt_clamp);
>         }
> +#endif
>  }
>
>  static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p)
>
> uclamp_rq_max_value() could be expensive as it loops over all buckets.
> Commenting this whole path out strangely doesn't just 'fix' it, but produces
> better results than the no-uclamp kernel :-/
>
>         # ./perf bench -r 20 sched pipe -T -l 50000
>         Without uclamp:         5039
>         With uclamp:            4832
>         With uclamp+patch:      5729
>
> It might be because schedutil gets biased differently by uclamp..? If I move to
> the performance governor these numbers almost double.
>
> I don't know. But this prompted me to look closer, and I think I spotted a bug
> where the if condition checks for '>=' instead of '>', causing us to take
> the supposedly impossible fail-safe path.
>
> Mind trying with the below patch please?

I have tried your patch and I don't see any difference compared to the
previous tests. Let me give you more details of my setup: I create 3
levels of cgroups and usually run the tests at the 4 levels (which
include root). The results above are for the root level, but I see a
difference at the other levels:

                          root           level 1        level 2        level 3

/w patch uclamp disable   50097          46615          43806          41078
tip uclamp enable         48706(-2.78%)  45583(-2.21%)  42851(-2.18%)  40313(-1.86%)
/w patch uclamp enable    48882(-2.43%)  45774(-1.80%)  43108(-1.59%)  40667(-1.00%)

Whereas tip with uclamp stays around 2% behind tip without uclamp, the
diff of uclamp with your patch tends to decrease as we increase the
number of levels.

Besides this, it's also interesting to notice the ~6% perf impact
between each level for the same image.

> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 0464569f26a7..50d66d4016ff 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1063,7 +1063,7 @@ static inline void uclamp_rq_dec_id(struct rq *rq, struct task_struct *p,
>          * e.g. due to future modification, warn and fixup the expected value.
>          */
>         SCHED_WARN_ON(bucket->value > rq_clamp);
> -       if (bucket->value >= rq_clamp) {
> +       if (bucket->value > rq_clamp) {
>                 bkt_clamp = uclamp_rq_max_value(rq, clamp_id, uc_se->value);
>                 WRITE_ONCE(uc_rq->value, bkt_clamp);
>         }
>
> Thanks
>
> --
> Qais Yousef
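
To make the cost being discussed a bit more concrete: uclamp's rq-wide clamp is
MAX-aggregated over per-value buckets, so when that branch in uclamp_rq_dec_id()
is taken, uclamp_rq_max_value() has to rescan the buckets from the top to find
the next non-empty one. The toy userspace model below is only an illustration of
that O(UCLAMP_BUCKETS) rescan; names, layout and the bucket values are made up
for the example and are not the kernel implementation.

/*
 * Toy model of MAX-aggregated clamp buckets. The rq-wide clamp is the
 * value of the highest non-empty bucket; when the last task in that
 * bucket is dequeued, finding the new clamp needs a scan over all
 * buckets, which is the cost discussed in the thread above.
 */
#include <stdio.h>

#define UCLAMP_BUCKETS 5

struct bucket {
	unsigned int tasks;	/* enqueued tasks mapped to this bucket */
	unsigned int value;	/* clamp value this bucket represents */
};

/* Scan from the highest bucket down and return the first non-empty one. */
static unsigned int rq_max_value(const struct bucket *b)
{
	for (int id = UCLAMP_BUCKETS - 1; id >= 0; id--) {
		if (b[id].tasks)
			return b[id].value;
	}
	return 0;	/* no runnable tasks: default clamp */
}

int main(void)
{
	struct bucket b[UCLAMP_BUCKETS] = {
		{ .tasks = 3, .value = 0    },
		{ .tasks = 0, .value = 256  },
		{ .tasks = 1, .value = 512  },
		{ .tasks = 0, .value = 768  },
		{ .tasks = 1, .value = 1024 },
	};

	printf("rq clamp = %u\n", rq_max_value(b));	/* 1024 */

	/* Dequeue the last task of the top bucket: a full rescan is needed. */
	b[4].tasks = 0;
	printf("rq clamp after dequeue = %u\n", rq_max_value(b));	/* 512 */
	return 0;
}

With the '>=' condition the rescan runs on every dequeue that touches the
current max bucket, which is essentially every dequeue in a single-clamp
workload like sched pipe; with '>' it only runs on the genuine fail-safe path.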
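
And for anyone who wants to poke at the numbers without perf: roughly speaking,
perf bench sched pipe -T -l 50000 is two threads ping-ponging over a pair of
pipes, so every iteration is a wakeup through the enqueue/dequeue fast path
that uclamp_rq_inc()/uclamp_rq_dec() sit on. A minimal standalone
approximation (not the perf source; timing and error handling are simplified)
could look like:

/*
 * Rough approximation of "perf bench sched pipe -T -l 50000": two
 * threads ping-pong one byte over two pipes, so each iteration forces a
 * wakeup on the scheduler fast path.
 *
 * Build: gcc -O2 -pthread pipe_bench.c -o pipe_bench
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define LOOPS 50000

static int ping[2], pong[2];	/* ping: main -> worker, pong: worker -> main */

static void *worker(void *arg)
{
	char c;

	for (int i = 0; i < LOOPS; i++) {
		if (read(ping[0], &c, 1) != 1)
			exit(1);
		if (write(pong[1], &c, 1) != 1)
			exit(1);
	}
	return NULL;
}

int main(void)
{
	struct timespec start, end;
	pthread_t thread;
	char c = 'x';
	double secs;

	if (pipe(ping) || pipe(pong))
		return 1;

	pthread_create(&thread, NULL, worker, NULL);
	clock_gettime(CLOCK_MONOTONIC, &start);

	for (int i = 0; i < LOOPS; i++) {
		if (write(ping[1], &c, 1) != 1)
			return 1;
		if (read(pong[0], &c, 1) != 1)
			return 1;
	}

	clock_gettime(CLOCK_MONOTONIC, &end);
	pthread_join(thread, NULL);

	secs = (end.tv_sec - start.tv_sec) +
	       (end.tv_nsec - start.tv_nsec) / 1e9;
	printf("%d loops in %.3f s, %.0f ops/sec\n", LOOPS, secs, LOOPS / secs);
	return 0;
}

The absolute numbers will obviously differ from perf's, and as noted above,
pinning the CPU frequency (performance governor) takes schedutil out of the
picture when comparing uclamp on/off runs.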