On 21/04/2020 13:27, Qais Yousef wrote: > On 04/21/20 13:18, Dietmar Eggemann wrote: >> On 20/04/2020 17:13, Qais Yousef wrote: >>> On 04/20/20 10:29, Dietmar Eggemann wrote: >>>> On 03.04.20 14:30, Qais Yousef wrote: >>>> >>>> [...] >>>> >>>>> @@ -924,6 +945,14 @@ uclamp_eff_get(struct task_struct *p, enum uclamp_id clamp_id) >>>>> return uc_req; >>>>> } >>>>> >>>>> +static void uclamp_rt_sync_default_util_min(struct task_struct *p) >>>>> +{ >>>>> + struct uclamp_se *uc_se = &p->uclamp_req[UCLAMP_MIN]; >>>>> + >>>>> + if (!uc_se->user_defined) >>>>> + uclamp_se_set(uc_se, sysctl_sched_rt_default_uclamp_util_min, false); >>>>> +} >>>>> + >>>>> unsigned long uclamp_eff_value(struct task_struct *p, enum uclamp_id clamp_id) >>>>> { >>>>> struct uclamp_se uc_eff; >>>>> @@ -1030,6 +1059,12 @@ static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p) >>>>> if (unlikely(!p->sched_class->uclamp_enabled)) >>>>> return; >>>>> >>>>> + /* >>>>> + * When sysctl_sched_rt_default_uclamp_util_min value is changed by the >>>>> + * user, we apply any new value on the next wakeup, which is here. >>>>> + */ >>>>> + uclamp_rt_sync_default_util_min(p); >>>>> + >>>> >>>> Does this have to be an extra function? Can we not reuse >>>> uclamp_tg_restrict() by slightly rename it to uclamp_restrict()? Btw, there was an issue in my little snippet. I used uc_req.user_defined uninitialized in uclamp_restrict(). diff --git a/kernel/sched/core.c b/kernel/sched/core.c index f3706dad32ce..7e6b2b7cd1e5 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -903,12 +903,11 @@ uclamp_restrict(struct task_struct *p, enum uclamp_id clamp_id) { struct uclamp_se uc_req, __maybe_unused uc_max; - if (unlikely(rt_task(p)) && clamp_id == UCLAMP_MIN && - !uc_req.user_defined) { + if (unlikely(rt_task(p)) && clamp_id == UCLAMP_MIN) { struct uclamp_se *uc_se = &p->uclamp_req[UCLAMP_MIN]; int rt_min = sysctl_sched_rt_default_uclamp_util_min; - if (uc_se->value != rt_min) { + if (!uc_se->user_defined && uc_se->value != rt_min) { uclamp_se_set(uc_se, rt_min, false); printk("uclamp_restrict() [%s %d] p->uclamp_req[%d].value=%d\n", p->comm, p->pid, clamp_id, uc_se->value); >>> Hmm the thing is that we're not restricting here. In contrary we're boosting, >>> so the name would be misleading. >> >> I always thought that we're restricting p->uclamp_req[UCLAMP_MIN].value (default 1024) to >> sysctl_sched_rt_default_uclamp_util_min (0-1024)? > > The way I look at it is that we're *setting* it to > sysctl_sched_rt_default_uclamp_util_min if !user_defined. > > The restriction mechanism that ensures this set value doesn't escape > cgroup/global restrictions setup. I guess we overall agree here. I see 3 restriction levels: (!user_defined) task -> taskgroup -> system I see sysctl_sched_rt_default_uclamp_util_min (min_rt_default) as a restriction on task level. It's true that the task level restriction is setting the value at the same time. For CFS (id=UCLAMP_[MIN\|MAX]) and RT (id=UCLAMP_MAX) we use uclamp_none(id) and those values (0, 1024) are fixed so these task level values don't need to be further restricted. For RT (id=UCLAMP_MIN) we use 'min_rt_default' and since it can change we have to check the task level restriction in 'uclamp_eff_get() -> uclamp_(tg)_restrict()'. root@h960:~# echo 999 > /proc/sys/kernel/sched_rt_default_util_clamp_min [ 2540.507236] uclamp_eff_get() [rtkit-daemon 419] tag=0 uclamp_id=0 uc_req.value=1024 [ 2540.514947] uclamp_eff_get() [rtkit-daemon 419] tag=1 uclamp_id=0 uc_req.value=1024 [ 2548.015208] uclamp_restrict() [rtkit-daemon 419] p->uclamp_req[0].value=999 root@h960:~# echo 666 > /proc/sys/kernel/sched_util_clamp_min [ 2548.022219] uclamp_eff_get() [rtkit-daemon 419] tag=0 uclamp_id=0 uc_req.value=999 [ 2548.029825] uclamp_eff_get() [rtkit-daemon 419] tag=1 uclamp_id=0 uc_req.value=999 [ 2553.479509] uclamp_eff_get() [rtkit-daemon 419] tag=0 uclamp_id=0 uc_max.value=666 [ 2553.487131] uclamp_eff_get() [rtkit-daemon 419] tag=1 uclamp_id=0 uc_max.value=666 Haven't tried to put an rt task into a taskgroup other than root.