On Tue, Mar 1, 2022 at 6:53 AM Feng Tang <feng.tang@xxxxxxxxx> wrote: > > On Mon, Feb 28, 2022 at 08:36:03PM +0100, Rafael J. Wysocki wrote: > > On Monday, February 28, 2022 5:12:28 AM CET Feng Tang wrote: > > > On Fri, Feb 25, 2022 at 04:36:53PM -0800, Doug Smythies wrote: > > > > On Fri, Feb 25, 2022 at 9:46 AM Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote: > > > > > > > > > > On Thursday, February 24, 2022 9:08:30 AM CET Feng Tang wrote: > > > > ... > > > > > > > So it looks like a new mechanism is needed for that. > > > > > > > > > > > > If you think idle class is not the right place to solve it, I can > > > > > > also help testing new patches. > > > > > > > > > > So I have the appended experimental patch to address this issue that's not > > > > > been tested at all. Caveat emptor. > > > > > > > > Hi Rafael, > > > > > > > > O.K., you gave fair warning. > > > > > > > > The patch applied fine. > > > > It does not compile for me. > > > > The function cpuidle_update_retain_tick does not exist. > > > > Shouldn't it be somewhere in cpuidle.c? > > > > I used the function cpuidle_disable_device as a template > > > > for searching and comparing. > > > > > > > > Because all of my baseline results are with kernel 5.17-rc3, > > > > that is what I am still using. > > > > > > > > Error: > > > > ld: drivers/cpufreq/intel_pstate.o: in function `intel_pstate_update_perf_ctl': > > > > intel_pstate.c:(.text+0x2520): undefined reference to > > > > `cpuidle_update_retain_tick' > > > > > > Same here, seems the cpuidle_update_retain_tick()'s implementation > > > is missing. > > > > That's a patch generation issue on my part, sorry. > > > > However, it was a bit racy, so maybe it's good that it was not complete. > > > > Below is a new version. > > Thanks for the new version. I just gave it a try, and the occasional > long delay of cpufreq auto-adjusting I have seen can not be reproduced > after applying it. OK, thanks! I'll wait for feedback from Dough, though. > As my test is a rough one which can't reproduce what Doug has seen > (including the power meter data), it's better to wait for his test result. > > And one minor question for the code. > > > --- > > drivers/cpufreq/intel_pstate.c | 40 ++++++++++++++++++++++++++++--------- > > drivers/cpuidle/governor.c | 23 +++++++++++++++++++++ > > drivers/cpuidle/governors/ladder.c | 6 +++-- > > drivers/cpuidle/governors/menu.c | 2 + > > drivers/cpuidle/governors/teo.c | 3 ++ > > include/linux/cpuidle.h | 4 +++ > > 6 files changed, 67 insertions(+), 11 deletions(-) > > > [SNIP] > > > Index: linux-pm/drivers/cpufreq/intel_pstate.c > > =================================================================== > > --- linux-pm.orig/drivers/cpufreq/intel_pstate.c > > +++ linux-pm/drivers/cpufreq/intel_pstate.c > > @@ -19,6 +19,7 @@ > > #include <linux/list.h> > > #include <linux/cpu.h> > > #include <linux/cpufreq.h> > > +#include <linux/cpuidle.h> > > #include <linux/sysfs.h> > > #include <linux/types.h> > > #include <linux/fs.h> > > @@ -1970,6 +1971,30 @@ static inline void intel_pstate_cppc_set > > } > > #endif /* CONFIG_ACPI_CPPC_LIB */ > > > > +static void intel_pstate_update_perf_ctl(struct cpudata *cpu) > > +{ > > + int pstate = cpu->pstate.current_pstate; > > + > > + /* > > + * Avoid stopping the scheduler tick from cpuidle on CPUs in turbo > > + * P-states to prevent them from getting back to the high frequency > > + * right away after getting out of deep idle. > > + */ > > + cpuidle_update_retain_tick(pstate > cpu->pstate.max_pstate); > > In our test, the workload will make CPU go to highest frequency before > going to idle, but should we also consider that the case that the > high but not highest cupfreq? This covers the entire turbo range (max_pstate is the highest non-turbo one).