On Tue, 2024-08-13 at 11:56 -0700, Ankur Arora wrote:
> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>
> Okanovic, Haris <harisokn@xxxxxxxxxx> writes:
>
> > On Mon, 2024-07-29 at 11:02 -0700, Ankur Arora wrote:
> > >
> > > Okanovic, Haris <harisokn@xxxxxxxxxx> writes:
> > >
> > > > On Fri, 2024-07-26 at 13:21 -0700, Ankur Arora wrote:
> > > > >
> > > > > Add architectural support for cpuidle-haltpoll driver by defining
> > > > > arch_haltpoll_*().
> > > > >
> > > > > Also define ARCH_CPUIDLE_HALTPOLL to allow cpuidle-haltpoll to be
> > > > > selected, and given that we have an optimized polling mechanism
> > > > > in smp_cond_load*(), select ARCH_HAS_OPTIMIZED_POLL.
> > > > >
> > > > > smp_cond_load*() are implemented via LDXR, WFE, with LDXR loading
> > > > > a memory region in exclusive state and the WFE waiting for any
> > > > > stores to it.
> > > > >
> > > > > In the edge case -- no CPU stores to the waited region and there's no
> > > > > interrupt -- the event-stream will provide the terminating condition
> > > > > ensuring we don't wait forever, but because the event-stream runs at
> > > > > a fixed frequency (configured at 10kHz) we might spend more time in
> > > > > the polling stage than specified by cpuidle_poll_time().
> > > > >
> > > > > This would only happen in the last iteration, since overshooting the
> > > > > poll_limit means the governor moves out of the polling stage.
> > > > >
> > > > > Signed-off-by: Ankur Arora <ankur.a.arora@xxxxxxxxxx>
> > > > > ---
> > > > >  arch/arm64/Kconfig                        | 10 ++++++++++
> > > > >  arch/arm64/include/asm/cpuidle_haltpoll.h |  9 +++++++++
> > > > >  arch/arm64/kernel/cpuidle.c               | 23 +++++++++++++++++++++++
> > > > >  3 files changed, 42 insertions(+)
> > > > >  create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h
> > > > >
> > > > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > > > > index 5d91259ee7b5..cf1c6681eb0a 100644
> > > > > --- a/arch/arm64/Kconfig
> > > > > +++ b/arch/arm64/Kconfig
> > > > > @@ -35,6 +35,7 @@ config ARM64
> > > > >  	select ARCH_HAS_MEMBARRIER_SYNC_CORE
> > > > >  	select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
> > > > >  	select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
> > > > > +	select ARCH_HAS_OPTIMIZED_POLL
> > > > >  	select ARCH_HAS_PTE_DEVMAP
> > > > >  	select ARCH_HAS_PTE_SPECIAL
> > > > >  	select ARCH_HAS_HW_PTE_YOUNG
> > > > > @@ -2376,6 +2377,15 @@ config ARCH_HIBERNATION_HEADER
> > > > >  config ARCH_SUSPEND_POSSIBLE
> > > > >  	def_bool y
> > > > >
> > > > > +config ARCH_CPUIDLE_HALTPOLL
> > > > > +	bool "Enable selection of the cpuidle-haltpoll driver"
> > > > > +	default n
> > > > > +	help
> > > > > +	  cpuidle-haltpoll allows for adaptive polling based on
> > > > > +	  current load before entering the idle state.
> > > > > +
> > > > > +	  Some virtualized workloads benefit from using it.
> > > > > +
> > > > >  endmenu # "Power management options"
> > > > >
> > > > >  menu "CPU Power Management"
> > > > > diff --git a/arch/arm64/include/asm/cpuidle_haltpoll.h b/arch/arm64/include/asm/cpuidle_haltpoll.h
> > > > > new file mode 100644
> > > > > index 000000000000..65f289407a6c
> > > > > --- /dev/null
> > > > > +++ b/arch/arm64/include/asm/cpuidle_haltpoll.h
> > > > > @@ -0,0 +1,9 @@
> > > > > +/* SPDX-License-Identifier: GPL-2.0 */
> > > > > +#ifndef _ARCH_HALTPOLL_H
> > > > > +#define _ARCH_HALTPOLL_H
> > > > > +
> > > > > +static inline void arch_haltpoll_enable(unsigned int cpu) { }
> > > > > +static inline void arch_haltpoll_disable(unsigned int cpu) { }
> > > > > +
> > > > > +bool arch_haltpoll_want(bool force);
> > > > > +#endif
> > > > > diff --git a/arch/arm64/kernel/cpuidle.c b/arch/arm64/kernel/cpuidle.c
> > > > > index f372295207fb..334df82a0eac 100644
> > > > > --- a/arch/arm64/kernel/cpuidle.c
> > > > > +++ b/arch/arm64/kernel/cpuidle.c
> > > > > @@ -72,3 +72,26 @@ __cpuidle int acpi_processor_ffh_lpi_enter(struct acpi_lpi_state *lpi)
> > > > >  				  lpi->index, state);
> > > > >  }
> > > > >  #endif
> > > > > +
> > > > > +#if IS_ENABLED(CONFIG_HALTPOLL_CPUIDLE)
> > > > > +
> > > > > +#include <asm/cpuidle_haltpoll.h>
> > > > > +
> > > > > +bool arch_haltpoll_want(bool force)
> > > > > +{
> > > > > +	/*
> > > > > +	 * Enabling haltpoll requires two things:
> > > > > +	 *
> > > > > +	 * - Event stream support to provide a terminating condition to the
> > > > > +	 *   WFE in the poll loop.
> > > > > +	 *
> > > > > +	 * - KVM support for arch_haltpoll_enable(), arch_haltpoll_enable().
> > > >
> > > > typo: "arch_haltpoll_enable()" appears twice; the second should be
> > > > "arch_haltpoll_disable()".
> > > >
> > > > > +	 *
> > > > > +	 * Given that the second is missing, allow haltpoll to only be force
> > > > > +	 * loaded.
> > > > > +	 */
> > > > > +	return (arch_timer_evtstrm_available() && false) || force;
> > > >
> > > > This should always evaluate false without force.
> > > > Perhaps you meant something like this?
> > > >
> > > > ```
> > > > - return (arch_timer_evtstrm_available() && false) || force;
> > > > + return arch_timer_evtstrm_available() || force;
> > > > ```
> > >
> > > No. This was intentional. As I mention in the comment above, right now
> > > the KVM support is missing. Which means that the guest has no way to
> > > tell the host not to poll as part of host haltpoll.
> > >
> > > Until that is available, only allow force loading.
> >
> > I see, arm64's KVM is missing the poll control mechanism.
> >
> > I'll follow up your changes with a patch for AWS Graviton; still seeing
> > the same performance gains.
>
> Excellent. Could you Cc me when you send out your changes?

Will do

-- Haris Okanovic

> > Tested-by: Haris Okanovic <harisokn@xxxxxxxxxx>
>
> Thanks!
>
> --
> ankur