On 01-08-19, 09:47, Rafael J. Wysocki wrote: > On Thu, Aug 1, 2019 at 8:17 AM Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote: > > > > On 31-07-19, 17:20, Doug Smythies wrote: > > > Hi Viresh, > > > > > > Summary: > > > > > > The old way, using UINT_MAX had two purposes: first, > > > as a "need to do a frequency update" flag; but also second, to > > > force any subsequent old/new frequency comparison to NOT be "the same, > > > so why bother actually updating" (see: sugov_update_next_freq). All > > > patches so far have been dealing with the flag, but only partially > > > the comparisons. In a busy system, and when schedutil.c doesn't actually > > > know the currently set system limits, the new frequency is dominated by > > > values the same as the old frequency. So, when sugov_fast_switch calls > > > sugov_update_next_freq, false is usually returned. > > > > And finally we know "Why" :) > > > > Good work Doug. Thanks for taking it to the end. > > > > > However, if we move the resetting of the flag and add another condition > > > to the "no need to actually update" decision, then perhaps this patch > > > version 1 will be O.K. It seems to be. (see way later in this e-mail). > > > > > With all this new knowledge, how about going back to > > > version 1 of this patch, and then adding this: > > > > > > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c > > > index 808d32b..f9156db 100644 > > > --- a/kernel/sched/cpufreq_schedutil.c > > > +++ b/kernel/sched/cpufreq_schedutil.c > > > @@ -100,7 +100,12 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time) > > > static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time, > > > unsigned int next_freq) > > > { > > > - if (sg_policy->next_freq == next_freq) > > > + /* > > > + * Always force an update if the flag is set, regardless. > > > + * In some implementations (intel_cpufreq) the frequency is clamped > > > + * further downstream, and might not actually be different here. > > > + */ > > > + if (sg_policy->next_freq == next_freq && !sg_policy->need_freq_update) > > > return false; > > > > This is not correct because this is an optimization we have in place > > to make things more efficient. And it was working by luck earlier and > > my patch broke it for good :) > > OK, so since we know why it was wrong now, why don't we just revert > it? Plus maybe add some comment explaining the rationale in there? Because the patch [1] which caused these issues was almost correct, just that it missed the busy accounting for single CPU case. The main idea behind the original patch [1] was to avoid any unwanted/hidden side-affects by overriding the value of next_freq. What we see above is exactly the case for that. Because we override the value of next_freq, we made intel-pstate work by chance, unintentionally. Which is wrong. And who knows what other side affects it had, we already found two (this one and the one fixed by [1]). I would strongly suggest that we don't override the value of next_freq with special meaning, as it is used at so many places we don't know what it may result in. -- viresh [1] ecd288429126 cpufreq: schedutil: Don't set next_freq to UINT_MAX