On Tue, Jul 23, 2019 at 11:15 AM Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote: > > On 23-07-19, 00:10, Doug Smythies wrote: > > On 2019.07.21 23:52 Viresh Kumar wrote: > > > > > To avoid reducing the frequency of a CPU prematurely, we skip reducing > > > the frequency if the CPU had been busy recently. > > > > > > This should not be done when the limits of the policy are changed, for > > > example due to thermal throttling. We should always get the frequency > > > within limits as soon as possible. > > > > > > Fixes: ecd288429126 ("cpufreq: schedutil: Don't set next_freq to UINT_MAX") > > > Cc: v4.18+ <stable@xxxxxxxxxxxxxxx> # v4.18+ > > > Reported-by: Doug Smythies <doug.smythies@xxxxxxxxx> > > > Signed-off-by: Viresh Kumar <viresh.kumar@xxxxxxxxxx> > > > --- > > > @Doug: Please try this patch, it must fix the issue you reported. > > > > It fixes the driver = acpi-cpufreq ; governor = schedutil test case > > It does not fix the driver = intel_cpufreq ; governor = schedutil test case > > > > I have checked my results twice, but will check again in the day or two. > > The patch you tried to revert wasn't doing any driver specific stuff > but only schedutil. If that revert fixes your issue with both the > drivers, then this patch should do it as well. > > I am clueless now on what can go wrong with intel_cpufreq driver with > schedutil now. > > Though there is one difference between intel_cpufreq and acpi_cpufreq, > intel_cpufreq has fast_switch_possible=true and so it uses slightly > different path in schedutil. I tried to look from that perspective as > well but couldn't find anything wrong. acpi-cpufreq should use fast switching on the Doug's system too. > If you still find intel_cpufreq to be broken, even with this patch, > please set fast_switch_possible=false instead of true in > __intel_pstate_cpu_init() and try tests again. That shall make it very > much similar to acpi-cpufreq driver. I wonder if this helps. Even so, we want fast switching to be used by intel_cpufreq. Anyway, it looks like the change reverted by the Doug's patch introduced a race condition that had not been present before. Namely, need_freq_update is cleared in get_next_freq() when it is set _or_ when the new freq is different from the cached one, so in the latter case if it happens to be set by sugov_limits() after evaluating sugov_should_update_freq() (which returned 'true' for timing reasons), that update will be lost now. [Previously the update would not be lost, because the clearing of need_freq_update depended only on its current value.] Where it matters is that in the "need_freq_update set" case, the "premature frequency reduction avoidance" should not be applied (as you noticed and hence the $subject patch). However, even with the $subject patch, need_freq_update may still be set by sugov_limits() after the check added by it and then cleared by get_next_freq(), so it doesn't really eliminate the problem. IMO eliminating would require invalidating next_freq this way or another when need_freq_update is set in sugov_should_update_freq(), which was done before commit ecd2884291261e3fddbc7651ee11a20d596bb514.