On 2019.07.31 23:17 Viresh Kumar wrote:
> On 31-07-19, 17:20, Doug Smythies wrote:
>> Summary:
>>
>> The old way, using UINT_MAX had two purposes: first,
>> as a "need to do a frequency update" flag; but also second, to
>> force any subsequent old/new frequency comparison to NOT be "the same,
>> so why bother actually updating" (see: sugov_update_next_freq). All
>> patches so far have been dealing with the flag, but only partially
>> the comparisons. In a busy system, and when schedutil.c doesn't actually
>> know the currently set system limits, the new frequency is dominated by
>> values the same as the old frequency. So, when sugov_fast_switch calls
>> sugov_update_next_freq, false is usually returned.
>
> And finally we know "Why" :)
>
> Good work Doug. Thanks for taking it to the end.
>
>> However, if we move the resetting of the flag and add another condition
>> to the "no need to actually update" decision, then perhaps this patch
>> version 1 will be O.K. It seems to be. (see way later in this e-mail).
>
>> With all this new knowledge, how about going back to
>> version 1 of this patch, and then adding this:
>>
>> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
>> index 808d32b..f9156db 100644
>> --- a/kernel/sched/cpufreq_schedutil.c
>> +++ b/kernel/sched/cpufreq_schedutil.c
>> @@ -100,7 +100,12 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
>>  static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
>>  				   unsigned int next_freq)
>>  {
>> -	if (sg_policy->next_freq == next_freq)
>> +	/*
>> +	 * Always force an update if the flag is set, regardless.
>> +	 * In some implementations (intel_cpufreq) the frequency is clamped
>> +	 * further downstream, and might not actually be different here.
>> +	 */
>> +	if (sg_policy->next_freq == next_freq && !sg_policy->need_freq_update)
>>  		return false;
>
> This is not correct because this is an optimization we have in place
> to make things more efficient. And it was working by luck earlier and
> my patch broke it for good :)

Disagree.
All I did was use a flag where it used to be set to UINT_MAX, to
basically implement the same thing.

> Things need to get a bit more synchronized and something like this may
> help (completely untested):
>
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index cc27d4c59dca..2d84361fbebc 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -2314,6 +2314,18 @@ static int intel_cpufreq_target(struct cpufreq_policy *policy,
>  	return 0;
>  }
>
> +static unsigned int intel_cpufreq_resolve_freq(struct cpufreq_policy *policy,
> +					       unsigned int target_freq)
> +{
> +	struct cpudata *cpu = all_cpu_data[policy->cpu];
> +	int target_pstate;
> +
> +	target_pstate = DIV_ROUND_UP(target_freq, cpu->pstate.scaling);
> +	target_pstate = intel_pstate_prepare_request(cpu, target_pstate);
> +
> +	return target_pstate * cpu->pstate.scaling;
> +}
> +
>  static unsigned int intel_cpufreq_fast_switch(struct cpufreq_policy *policy,
>  					      unsigned int target_freq)
>  {
> @@ -2350,6 +2362,7 @@ static struct cpufreq_driver intel_cpufreq = {
>  	.verify		= intel_cpufreq_verify_policy,
>  	.target		= intel_cpufreq_target,
>  	.fast_switch	= intel_cpufreq_fast_switch,
> +	.resolve_freq	= intel_cpufreq_resolve_freq,
>  	.init		= intel_cpufreq_cpu_init,
>  	.exit		= intel_pstate_cpu_exit,
>  	.stop_cpu	= intel_cpufreq_stop_cpu,
>
> -------------------------8<-------------------------
>
> Please try this with my patch 2.

O.K.

> We need patch 2 instead of 1 because
> of another race condition Rafael noticed.

Disagree.
Notice that my modifications to your patch1 address that condition
by moving the clearing of "need_freq_update" to sometime later.
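
To make the disagreement concrete, here is a minimal user-space sketch of
the three behaviours under discussion. It is not kernel code: struct
policy_model and every function name in it are hypothetical stand-ins for
the next_freq and need_freq_update fields of struct sugov_policy, and the
point at which the flag is cleared is only an assumption about what
"sometime later" means.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical model of the two sugov_policy fields discussed here. */
struct policy_model {
	unsigned int next_freq;
	bool need_freq_update;
};

/* Old scheme: a limits change stored UINT_MAX, so the equality
 * test in the update function could never match afterwards. */
static void old_limits_changed(struct policy_model *p)
{
	p->next_freq = UINT_MAX;
}

/* Flag-only scheme: the comparison ignores need_freq_update. */
static bool update_flag_only(struct policy_model *p, unsigned int next_freq)
{
	if (p->next_freq == next_freq)
		return false;
	p->next_freq = next_freq;
	return true;
}

/* Comparison with the extra condition from the quoted diff.
 * Clearing the flag here, after the decision, is an assumption. */
static bool update_with_flag_check(struct policy_model *p,
				   unsigned int next_freq)
{
	if (p->next_freq == next_freq && !p->need_freq_update)
		return false;
	p->next_freq = next_freq;
	p->need_freq_update = false;
	return true;
}

/* Walk through the cases; each assert documents the outcome. */
static void demo(void)
{
	struct policy_model p = { .next_freq = 2000000, .need_freq_update = true };

	assert(!update_flag_only(&p, 2000000));      /* same value: skipped */
	assert(update_with_flag_check(&p, 2000000)); /* flag forces update */
	assert(!update_with_flag_check(&p, 2000000));/* flag now clear */

	old_limits_changed(&p);
	assert(update_flag_only(&p, 2000000));       /* sentinel forces update */
}
```

With the flag set and an unchanged frequency, only the sentinel and the
extra-condition variants report that an update is needed, which is the
behaviour the UINT_MAX assignment used to provide.
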
>
> cpufreq_schedutil calls driver specific resolve_freq() to find the new
> target frequency and this is where the limits should get applied IMO.

Oh! I didn't know. But yes, that makes sense.

>
> Rafael can help with reviewing this diff but it would be great if you
> can give this a try Doug.

Anyway, I added the above code (I am calling it patch3) to patch2, as
you asked, and it does work. I also added it to my modified patch1,
additionally removing the extra condition check that I added
(i.e. all that remains of my patch1 modifications is the moved
clearing of "need_freq_update"). That kernel also worked for both
intel_cpufreq/schedutil and acpi-cpufreq/schedutil.

Again, I do not know how to test the original issue that led to the
change away from UINT_MAX in the first place,
ecd2884291261e3fddbc7651ee11a20d596bb514,
which should be tested in case of some introduced regression.

... Doug
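
For reference, the arithmetic in the proposed resolve_freq() callback can
be sketched in user space as below. This is only an illustration:
struct cpu_model, clamp_pstate() and resolve_freq_model() are hypothetical
names, and it assumes that intel_pstate_prepare_request() amounts to
clamping the requested P-state to the current minimum and maximum.

```c
#include <assert.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical per-CPU limits, expressed in P-state units. */
struct cpu_model {
	int scaling;     /* kHz per P-state step */
	int min_pstate;  /* current lower limit */
	int max_pstate;  /* current upper limit */
};

/* Assumed equivalent of the limit handling in the real driver. */
static int clamp_pstate(const struct cpu_model *cpu, int pstate)
{
	if (pstate < cpu->min_pstate)
		return cpu->min_pstate;
	if (pstate > cpu->max_pstate)
		return cpu->max_pstate;
	return pstate;
}

/* Round the request up to a whole P-state, apply the limits,
 * and convert back to kHz so the governor sees the real target. */
static unsigned int resolve_freq_model(const struct cpu_model *cpu,
				       unsigned int target_freq)
{
	int target_pstate;

	assert(cpu->scaling > 0);
	target_pstate = DIV_ROUND_UP(target_freq, (unsigned int)cpu->scaling);
	target_pstate = clamp_pstate(cpu, target_pstate);

	return (unsigned int)(target_pstate * cpu->scaling);
}
```

With scaling = 100000, min_pstate = 8 and max_pstate = 20, a request for
3500000 kHz resolves to 2000000 kHz. Repeated requests above the limit
therefore resolve to the same value, so the "same as before" comparison in
schedutil works on the frequency that would actually be set.
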