On Mon, Feb 28, 2022 at 08:36:03PM +0100, Rafael J. Wysocki wrote: > On Monday, February 28, 2022 5:12:28 AM CET Feng Tang wrote: > > On Fri, Feb 25, 2022 at 04:36:53PM -0800, Doug Smythies wrote: > > > On Fri, Feb 25, 2022 at 9:46 AM Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote: > > > > > > > > On Thursday, February 24, 2022 9:08:30 AM CET Feng Tang wrote: > > > ... > > > > > > So it looks like a new mechanism is needed for that. > > > > > > > > > > If you think idle class is not the right place to solve it, I can > > > > > also help testing new patches. > > > > > > > > So I have the appended experimental patch to address this issue that's not > > > > been tested at all. Caveat emptor. > > > > > > Hi Rafael, > > > > > > O.K., you gave fair warning. > > > > > > The patch applied fine. > > > It does not compile for me. > > > The function cpuidle_update_retain_tick does not exist. > > > Shouldn't it be somewhere in cpuidle.c? > > > I used the function cpuidle_disable_device as a template > > > for searching and comparing. > > > > > > Because all of my baseline results are with kernel 5.17-rc3, > > > that is what I am still using. > > > > > > Error: > > > ld: drivers/cpufreq/intel_pstate.o: in function `intel_pstate_update_perf_ctl': > > > intel_pstate.c:(.text+0x2520): undefined reference to > > > `cpuidle_update_retain_tick' > > > > Same here, seems the cpuidle_update_retain_tick()'s implementation > > is missing. > > That's a patch generation issue on my part, sorry. > > However, it was a bit racy, so maybe it's good that it was not complete. > > Below is a new version. Thanks for the new version. I just gave it a try, and the occasional long delay of cpufreq auto-adjusting I have seen can not be reproduced after applying it. As my test is a rough one which can't reproduce what Doug has seen (including the power meter data), it's better to wait for his test result. And one minor question for the code. > --- > drivers/cpufreq/intel_pstate.c | 40 ++++++++++++++++++++++++++++--------- > drivers/cpuidle/governor.c | 23 +++++++++++++++++++++ > drivers/cpuidle/governors/ladder.c | 6 +++-- > drivers/cpuidle/governors/menu.c | 2 + > drivers/cpuidle/governors/teo.c | 3 ++ > include/linux/cpuidle.h | 4 +++ > 6 files changed, 67 insertions(+), 11 deletions(-) > [SNIP] > Index: linux-pm/drivers/cpufreq/intel_pstate.c > =================================================================== > --- linux-pm.orig/drivers/cpufreq/intel_pstate.c > +++ linux-pm/drivers/cpufreq/intel_pstate.c > @@ -19,6 +19,7 @@ > #include <linux/list.h> > #include <linux/cpu.h> > #include <linux/cpufreq.h> > +#include <linux/cpuidle.h> > #include <linux/sysfs.h> > #include <linux/types.h> > #include <linux/fs.h> > @@ -1970,6 +1971,30 @@ static inline void intel_pstate_cppc_set > } > #endif /* CONFIG_ACPI_CPPC_LIB */ > > +static void intel_pstate_update_perf_ctl(struct cpudata *cpu) > +{ > + int pstate = cpu->pstate.current_pstate; > + > + /* > + * Avoid stopping the scheduler tick from cpuidle on CPUs in turbo > + * P-states to prevent them from getting back to the high frequency > + * right away after getting out of deep idle. > + */ > + cpuidle_update_retain_tick(pstate > cpu->pstate.max_pstate); In our test, the workload will make CPU go to highest frequency before going to idle, but should we also consider that the case that the high but not highest cupfreq? Thanks, Feng