Mihai Carabas <mihai.carabas@xxxxxxxxxx> writes: >>>>> cpu_relax on ARM64 does a simple "yield". Thus we replace it with >>>>> smp_cond_load_relaxed which basically does a "wfe". >>>>> >>>>> Suggested-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx> >>>>> Signed-off-by: Mihai Carabas <mihai.carabas@xxxxxxxxxx> >>>>> --- >>>>> drivers/cpuidle/poll_state.c | 14 +++++++++----- >>>>> 1 file changed, 9 insertions(+), 5 deletions(-) >>>>> >>>>> diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c >>>>> index 9b6d90a72601..440cd713e39a 100644 >>>>> --- a/drivers/cpuidle/poll_state.c >>>>> +++ b/drivers/cpuidle/poll_state.c >>>>> @@ -26,12 +26,16 @@ static int __cpuidle poll_idle(struct cpuidle_device *dev, >>>>> limit = cpuidle_poll_time(drv, dev); >>>>> - while (!need_resched()) { >>>>> - cpu_relax(); >>>>> - if (loop_count++ < POLL_IDLE_RELAX_COUNT) >>>>> - continue; >>>>> - >>>>> + for (;;) { >>>>> loop_count = 0; >>>>> + >>>>> + smp_cond_load_relaxed(¤t_thread_info()->flags, >>>>> + (VAL & _TIF_NEED_RESCHED) || >>>>> + (loop_count++ >= POLL_IDLE_RELAX_COUNT)); >>>>> + >>>>> + if (loop_count < POLL_IDLE_RELAX_COUNT) >>>>> + break; >>>>> + >>>>> if (local_clock_noinstr() - time_start > limit) { >>>>> dev->poll_time_limit = true; >>>>> break; >>>> Doesn't this make ARCH_HAS_CPU_RELAX a complete misnomer? >>> This controls the build of poll_state.c and the generic definition of >>> smp_cond_load_relaxed (used by x86) is using cpu_relax(). Do you propose >>> other approach here? >> Give it a better name? Having ARCH_HAS_CPU_RELAX control a piece of code >> that doesn't use cpu_relax() doesn't make sense to me. > > The generic code for smp_cond_load_relaxed is using cpu_relax and this one is > used on x86 - so ARCH_HAS_CPU_RELAX is a prerequisite on x86 when using > haltpoll. Only on ARM64 this is overwritten. Moreover ARCH_HAS_CPU_RELAX is > controlling the function definition for cpuidle_poll_state_init (this is how it > was originally designed). I suspect Will's point is that the term ARCH_HAS_CPU_RELAX doesn't make a whole lot of sense when we are only indirectly using cpu_relax() in the series. Also, all archs define cpu_relax() (though some as just a barrier()) so ARCH_HAS_CPU_RELAX . Maybe an arch can instead just opt into polling in idle? Perhaps something like this trivial patch: -- diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 5edec175b9bf..d80c98c64fd4 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -367,7 +367,7 @@ config ARCH_MAY_HAVE_PC_FDC config GENERIC_CALIBRATE_DELAY def_bool y -config ARCH_HAS_CPU_RELAX +config ARCH_WANTS_IDLE_POLL def_bool y config ARCH_HIBERNATION_POSSIBLE diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c index 55437f5e0c3a..6a0a1f16a5c3 100644 --- a/drivers/acpi/processor_idle.c +++ b/drivers/acpi/processor_idle.c @@ -36,7 +36,7 @@ #include <asm/cpu.h> #endif -#define ACPI_IDLE_STATE_START (IS_ENABLED(CONFIG_ARCH_HAS_CPU_RELAX) ? 1 : 0) +#define ACPI_IDLE_STATE_START (IS_ENABLED(CONFIG_ARCH_WANTS_IDLE_POLL) ? 1 : 0) static unsigned int max_cstate __read_mostly = ACPI_PROCESSOR_MAX_POWER; module_param(max_cstate, uint, 0400); @@ -787,7 +787,7 @@ static int acpi_processor_setup_cstates(struct acpi_processor *pr) if (max_cstate == 0) max_cstate = 1; - if (IS_ENABLED(CONFIG_ARCH_HAS_CPU_RELAX)) { + if (IS_ENABLED(CONFIG_ARCH_WANTS_IDLE_POLL)) { cpuidle_poll_state_init(drv); count = 1; } else { diff --git a/drivers/cpuidle/Makefile b/drivers/cpuidle/Makefile index d103342b7cfc..23f48d99f0f2 100644 --- a/drivers/cpuidle/Makefile +++ b/drivers/cpuidle/Makefile @@ -7,7 +7,7 @@ obj-y += cpuidle.o driver.o governor.o sysfs.o governors/ obj-$(CONFIG_ARCH_NEEDS_CPU_IDLE_COUPLED) += coupled.o obj-$(CONFIG_DT_IDLE_STATES) += dt_idle_states.o obj-$(CONFIG_DT_IDLE_GENPD) += dt_idle_genpd.o -obj-$(CONFIG_ARCH_HAS_CPU_RELAX) += poll_state.o +obj-$(CONFIG_ARCH_WANTS_IDLE_POLL) += poll_state.o obj-$(CONFIG_HALTPOLL_CPUIDLE) += cpuidle-haltpoll.o ################################################################################## diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h index 3183aeb7f5b4..53e55a91d55d 100644 --- a/include/linux/cpuidle.h +++ b/include/linux/cpuidle.h @@ -275,7 +275,7 @@ static inline void cpuidle_coupled_parallel_barrier(struct cpuidle_device *dev, } #endif -#if defined(CONFIG_CPU_IDLE) && defined(CONFIG_ARCH_HAS_CPU_RELAX) +#if defined(CONFIG_CPU_IDLE) && defined(CONFIG_ARCH_WANTS_IDLE_POLL) void cpuidle_poll_state_init(struct cpuidle_driver *drv); #else static inline void cpuidle_poll_state_init(struct cpuidle_driver *drv) {}