On Thu, Jan 09, 2020 at 01:00:58PM +0000, Qais Yousef wrote:
> On 01/08/20 14:44, Peter Zijlstra wrote:
> > On Fri, Dec 20, 2019 at 04:48:38PM +0000, Qais Yousef wrote:
> > > RT tasks by default try to run at the highest capacity/performance
> > > level. When uclamp is selected this default behavior is retained by
> > > enforcing the uclamp_util_min of the RT tasks to be
> > > uclamp_none(UCLAMP_MAX), which is SCHED_CAPACITY_SCALE; the maximum
> > > value.
> > >
> > > See commit 1a00d999971c ("sched/uclamp: Set default clamps for RT tasks").
> > >
> > > On battery powered devices, this default behavior could consume more
> > > power, and it is desired to be able to tune it down. While uclamp allows
> > > tuning this by changing the uclamp_util_min of the individual tasks, but
> > > this is cumbersome and error prone.
> > >
> > > To control the default behavior globally by system admins and device
> > > integrators, introduce the new sysctl_sched_rt_uclamp_util_min to
> > > change the default uclamp_util_min value of the RT tasks.
> > >
> > > Whenever the new default changes, it'd be applied on the next wakeup of
> > > the RT task, assuming that it still uses the system default value and
> > > not a user applied one.
> >
> > This is because these RT tasks are not in a cgroup or not affected by
> > cgroup settings? I feel the justification is a little thin here.
>
> The uclamp_min for RT tasks is always hardcoded to 1024 at the moment. So even
> if they belong to a cgroup->uclamp_min = 0, they'll still run at max frequency,
> no?

Argh, this is that counter intuitive max aggregate nonsense biting me.
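
For anyone following along, the max aggregation complained about above can
be illustrated with a small userspace sketch. This is not kernel code: the
toy_task structure and the toy_* helpers are hypothetical names made up for
illustration, and the sketch assumes only that the runqueue-wide minimum
clamp is the largest uclamp_min among the runnable tasks. Cgroup
restrictions and the real data structures are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

#define SCHED_CAPACITY_SCALE 1024U

/* Hypothetical, simplified stand-in for a runnable task. */
struct toy_task {
	unsigned int uclamp_min;	/* requested minimum utilization, 0..1024 */
};

/*
 * Max aggregation (assumed model): the runqueue-wide minimum clamp is the
 * largest uclamp_min among the runnable tasks, or 0 if none is runnable.
 */
unsigned int toy_rq_uclamp_min(const struct toy_task *tasks, size_t nr)
{
	unsigned int rq_min = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (tasks[i].uclamp_min > rq_min)
			rq_min = tasks[i].uclamp_min;
	}

	return rq_min;
}

/* One RT task at the default boost sitting next to two unboosted tasks. */
unsigned int toy_example(void)
{
	const struct toy_task tasks[] = {
		{ .uclamp_min = 0 },
		{ .uclamp_min = SCHED_CAPACITY_SCALE },	/* RT default */
		{ .uclamp_min = 0 },
	};
	unsigned int rq_min;

	rq_min = toy_rq_uclamp_min(tasks, sizeof(tasks) / sizeof(tasks[0]));

	/* The aggregate can never exceed the capacity scale. */
	assert(rq_min <= SCHED_CAPACITY_SCALE);

	return rq_min;
}
```

Under that assumption a single runnable task requesting 1024 is enough to
pin the whole runqueue at the maximum, whatever the other tasks ask for.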