On Wed, Nov 17, 2021 at 06:59:05PM +0100, Rafael J. Wysocki wrote: > On Wed, Nov 17, 2021 at 6:01 PM Thara Gopinath > <thara.gopinath@xxxxxxxxxx> wrote: > > > > Hi, > > > > On 11/17/21 7:49 AM, Rafael J. Wysocki wrote: > > > On Wed, Nov 17, 2021 at 11:46 AM Lukasz Luba <lukasz.luba@xxxxxxx> wrote: > > >> > > >> Hi Rafael, > > >> > > >> On 11/16/21 7:05 PM, Rafael J. Wysocki wrote: > > >>> On Mon, Nov 15, 2021 at 9:10 PM Thara Gopinath > > >>> <thara.gopinath@xxxxxxxxxx> wrote: > > >>>> > > >>>> cpuinfo.max_freq can reflect boost frequency if enabled during boot. Since > > >>>> we don't consider boost frequencies while calculating cpu capacities, use > > >>>> policy->max to populate the freq_factor during boot up. > > >>> > > >>> I'm not sure about this. schedutil uses cpuinfo.max_freq as the max frequency. > > >> > > >> Agree it's tricky how we treat the boost frequencies and also combine > > >> them with thermal pressure. > > >> We probably would have consider these design bits: > > >> 1. Should thermal pressure include boost frequency? > > > > > > Well, I guess so. > > > > > > Running at a boost frequency certainly increases thermal pressure. > > > > > >> 2. Should max capacity 1024 be a boost frequency so scheduler > > >> would see it explicitly? > > > > > > That's what it is now if cpuinfo.max_freq is a boost frequency. > > > > > >> - if no, then schedutil could still request boost freq thanks to > > >> map_util_perf() where we add 25% to the util and then > > >> map_util_freq() would return a boost freq when util was > 1024 > > >> > > >> > > >> I can see in schedutil only one place when cpuinfo.max_freq is used: > > >> get_next_freq(). If the value stored in there is a boost, > > >> then don't we get a higher freq value for the same util? > > > > > > Yes. we do, which basically is my point. > > > > > > The schedutil's response is proportional to cpuinfo.max_freq and that > > > needs to be taken into account for the results to be consistent. > > > > So IIUC, cpuinfo.max_freq is always supposed to be the highest supported > > frequency of a cpu, irrespective of whether boost is enabled or not. > > Where as policy->max is the currently available maximum cpu frequency > > which can be equal to cpuinfo.max_freq or lower (depending on whether > > boost is enabled, whether there is a constraint on policy->max placed by > > thermal etc). > > It may also depend on the limit set by user space. > > > So in this case isn't it better for schedutil to consider > > policy->max instead of cpuinfo.max ? > > Not really. > > In that case setting policy->max to 1/2 of cpuinfo.max_freq would > cause schedutil to choose 1/4 of cpuinfo.max_freq for 50% utilization > which would be rather unexpected. > > policy->max is a cap, not the current maximum capacity. > > > Like you mentioned above same > > utilization will relate to different frequencies depending on the > > maximum frequency. > > Which is not how it is expected (and defined) to work, though. > > If you really want to play with the current maximum capacity, you need > to change it whenever boost is disabled or enabled - and there is a > mechanism for updating cpufinfo.max_freq in such cases. I don't see why we would want to change max capacity on the fly. It is not a cheap operation as we would need to normalize the capacity for all CPUs if the CPU(s) with capacity = 1024 changes its capacity. Worst case we even have to rebuild the sched_domain hierarchy to update flags. The update would also temporarily mess with load and utilization signals, so not a cheap operation. Morten