On Tue, Aug 1, 2023 at 2:45 AM Quentin Perret <qperret@xxxxxxxxxx> wrote:
>
> Hi David,
>
> On Monday 31 Jul 2023 at 10:46:09 (-0700), David Dai wrote:
> > +static unsigned int virt_cpufreq_set_perf(struct cpufreq_policy *policy)
> > +{
> > +	struct virt_cpufreq_drv_data *data = policy->driver_data;
> > +	/*
> > +	 * Use cached frequency to avoid rounding to freq table entries
> > +	 * and undo 25% frequency boost applied by schedutil.
> > +	 */
>
> The VMM would be a better place for this scaling I think, the driver
> can't/shouldn't make assumptions about the governor it is running with
> given that this is a guest userspace decision essentially.
>
> IIRC the fast_switch() path is only used by schedutil, so one could
> probably make a case to scale things there, but it'd be inconsistent
> with the "slow" switch case, and would create a fragile dependency, so
> it's probably not worth pursuing.

Thanks for the input Quentin!

David and I spent several hours over several days discussing this. We
were trying to think through and decide whether we were really removing
the 25% margin applied by the guest side schedutil or the host side
schedutil. We ran through different thought experiments on what would
happen if the guest used ondemand/conservative/performance/powersave
governors, and what would happen if in the future we had a
configurable schedutil margin.

We changed our opinions multiple times until we finally remembered
this goal from my original presentation[1]:

"On an idle host, running the use case in the host vs VM, should
result in close to identical DVFS behavior of the physical CPUs and
CPU selection for the threads."

For that statement to be true when the guest uses the
ondemand/conservative governor, we have to remove the 25% margin
applied by the host side schedutil governor. Otherwise, running the
workload in the VM will result in frequencies 25% higher than running
the same load on the host with the ondemand/conservative governor.
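
To make the arithmetic concrete, here is a minimal userspace sketch of
why scaling by 80/100 cancels a 25% margin (1.25 * 0.8 = 1). It is not
the driver or VMM code: mult_frac_u32() is a stand-in for the kernel's
mult_frac() macro, and add_margin_25() and undo_margin_25() are
hypothetical helper names used only for illustration.

```c
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's mult_frac() macro: computes
 * x * numer / denom by splitting x into quotient and remainder, so the
 * full product x * numer is never formed.
 */
static uint32_t mult_frac_u32(uint32_t x, uint32_t numer, uint32_t denom)
{
	uint32_t quot = x / denom;
	uint32_t rem = x % denom;

	return quot * numer + (rem * numer) / denom;
}

/*
 * Hypothetical helper: apply a 25% margin, written as x + x/4.
 * Assumption: this mirrors the margin discussed in the thread; it is
 * not copied from schedutil.
 */
static uint32_t add_margin_25(uint32_t freq_khz)
{
	return freq_khz + (freq_khz >> 2);
}

/*
 * Hypothetical helper: undo the 25% margin by scaling to 80%, the same
 * ratio used by mult_frac(..., 80, 100) in the quoted patch.
 */
static uint32_t undo_margin_25(uint32_t freq_khz)
{
	return mult_frac_u32(freq_khz, 80, 100);
}
```

For example, a guest request of 1600000 kHz becomes 2000000 kHz once a
25% margin is added, which is the "25% higher" case described above;
undo_margin_25(2000000) brings it back to 1600000 kHz.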

So, we finally concluded that we are really undoing the host side
schedutil margin. In that case, it makes sense to undo it on the VMM
side. So, we'll go with your suggestion in this email instead of
setting the schedutil margin to 0 for the guest.

[1] - https://lpc.events/event/16/contributions/1195/attachments/970/1893/LPC%202022%20-%20VM%20DVFS.pdf

Thanks,
Saravana

>
> > +	u32 freq = mult_frac(policy->cached_target_freq, 80, 100);
> > +
> > +	data->ops->set_freq(policy, freq);
> > +	return 0;
> > +}
>
> Thanks,
> Quentin