Subject: Re: [PATCH 0/9] Enable haltpoll for arm64

A correction: please read the subject for the series as [PATCH v5] ...
Missed the version number while sending it out.

Thanks
Ankur

Ankur Arora <ankur.a.arora@xxxxxxxxxx> writes:

> This patchset enables the cpuidle-haltpoll driver and its namesake
> governor on arm64. This is of particular interest for KVM guests,
> where it reduces IPC latencies.
>
> Comparing idle switching latencies on an arm64 KVM guest with
> perf bench sched pipe:
>
>                                         usecs/op    %stdev
>
>   no haltpoll (baseline)                   13.48    +-  5.19%
>   with haltpoll                             6.84    +- 22.07%
>
>
> No change in performance for a similar test on x86:
>
>                                         usecs/op    %stdev
>
>   haltpoll w/ cpu_relax() (baseline)        4.75    +-  1.76%
>   haltpoll w/ smp_cond_load_relaxed()       4.78    +-  2.31%
>
> Both sets of tests were on otherwise idle systems with guest VCPUs
> pinned to specific PCPUs. One reason for the higher stdev on arm64
> is that trapping of the WFE instruction by the host KVM is contingent
> on the number of tasks on the runqueue.
>
>
> The patch series is organized in four parts:
>  - patches 1, 2 mangle the config option ARCH_HAS_CPU_RELAX, renaming
>    and moving it from x86 to common architectural code.
>  - next, patches 3-5 reorganize the haltpoll selection and init logic
>    to allow architecture code to select it.
>  - patch 6 reorganizes the poll_idle() loop, switching from using
>    cpu_relax() directly to smp_cond_load_relaxed().
>  - and finally, patches 7-9 add the bits for arm64 support.
>
> What is still missing: this series largely completes the haltpoll side
> of functionality for arm64. There are, however, a few related areas
> that still need to be thrashed out:
>
>  - WFET support: WFE on arm64 does not guarantee that poll_idle()
>    would terminate in halt_poll_ns. Using WFET would address this.
>  - KVM_NO_POLL support on arm64
>  - KVM TWED support on arm64: allow the host to limit time spent in
>    WFE.
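
For readers following along, the shape of the poll loop described for
patch 6 can be sketched in user space. This is only an illustration under
stated assumptions, not the code from the series: sketch_poll_idle(),
sketch_now_ns() and SKETCH_RELAX_COUNT are hypothetical names, C11 atomics
stand in for smp_cond_load_relaxed(), and a plain busy-wait stands in for
WFE. It shows why the time limit is only checked every few iterations,
which is also why a wait that does not wake up on its own (the WFE case
above) could overshoot the limit.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for POLL_IDLE_RELAX_COUNT; the value is arbitrary. */
#define SKETCH_RELAX_COUNT 200u

/* Monotonic time in nanoseconds (user-space stand-in for the kernel clock). */
static uint64_t sketch_now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Spin until (*flags & mask) is non-zero or limit_ns has elapsed.
 * Returns true if the wake-up bit was observed, false on timeout.
 * The clock is read only once per SKETCH_RELAX_COUNT loads, so the
 * common path is a cheap relaxed load of the flag word.
 */
static bool sketch_poll_idle(atomic_uint *flags, unsigned int mask,
                             uint64_t limit_ns)
{
    uint64_t start = sketch_now_ns();
    unsigned int loops = 0;

    assert(mask != 0);

    for (;;) {
        /* Condition check: analogous to polling for a need-resched bit. */
        if (atomic_load_explicit(flags, memory_order_relaxed) & mask)
            return true;

        if (++loops < SKETCH_RELAX_COUNT)
            continue;

        /* Periodic, comparatively expensive time-limit check. */
        loops = 0;
        if (sketch_now_ns() - start > limit_ns)
            return false;
    }
}
```

Calling sketch_poll_idle(&flags, 1u, 1000000) with the bit clear spins for
roughly a millisecond and returns false; with the bit already set it
returns true on the first load.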
>
>
> Changelog:
>
>   v5:
>    - rework the poll_idle() loop around smp_cond_load_relaxed() (review
>      comment from Tomohiro Misono.)
>    - also rework selection of cpuidle-haltpoll. Now selected based
>      on the architectural selection of ARCH_CPUIDLE_HALTPOLL.
>    - arch_haltpoll_supported() (renamed from arch_haltpoll_want()) on
>      arm64 now depends on the event-stream being enabled.
>    - limit POLL_IDLE_RELAX_COUNT on arm64 (review comment from Haris Okanovic)
>    - ARCH_HAS_CPU_RELAX is now renamed to ARCH_HAS_OPTIMIZED_POLL.
>
>   v4 changes from v3:
>    - change 7/8 per Rafael's input: drop the parens and use ret for the final check
>    - add 8/8 which renames the guard for building poll_state
>
>   v3 changes from v2:
>    - fix 1/7 per Petr Mladek - remove ARCH_HAS_CPU_RELAX from arch/x86/Kconfig
>    - add Acked-by from Rafael Wysocki on 2/7
>
>   v2 changes from v1:
>    - added patch 7 where we change cpu_relax with smp_cond_load_relaxed per PeterZ
>      (this improves by 50% at least the CPU cycles consumed in the tests above:
>      10,716,881,137 now vs 14,503,014,257 before)
>    - removed the ifdef from patch 1 per RafaelW
>
> Ankur Arora (4):
>   cpuidle: rename ARCH_HAS_CPU_RELAX to ARCH_HAS_OPTIMIZED_POLL
>   cpuidle-haltpoll: condition on ARCH_CPUIDLE_HALTPOLL
>   arm64: support cpuidle-haltpoll
>   cpuidle/poll_state: limit POLL_IDLE_RELAX_COUNT on arm64
>
> Joao Martins (4):
>   Kconfig: move ARCH_HAS_OPTIMIZED_POLL to arch/Kconfig
>   cpuidle-haltpoll: define arch_haltpoll_supported()
>   governors/haltpoll: drop kvm_para_available() check
>   arm64: define TIF_POLLING_NRFLAG
>
> Mihai Carabas (1):
>   cpuidle/poll_state: poll via smp_cond_load_relaxed()
>
>  arch/Kconfig                              |  3 +++
>  arch/arm64/Kconfig                        | 10 ++++++++++
>  arch/arm64/include/asm/cpuidle_haltpoll.h | 21 +++++++++++++++++++++
>  arch/arm64/include/asm/thread_info.h      |  2 ++
>  arch/x86/Kconfig                          |  4 +---
>  arch/x86/include/asm/cpuidle_haltpoll.h   |  1 +
>  arch/x86/kernel/kvm.c                     | 10 ++++++++++
>  drivers/acpi/processor_idle.c             |  4 ++--
>  drivers/cpuidle/Kconfig                   |  5 ++---
>  drivers/cpuidle/Makefile                  |  2 +-
>  drivers/cpuidle/cpuidle-haltpoll.c        |  9 ++-------
>  drivers/cpuidle/governors/haltpoll.c      |  6 +-----
>  drivers/cpuidle/poll_state.c              | 21 ++++++++++++++++-----
>  drivers/idle/Kconfig                      |  1 +
>  include/linux/cpuidle.h                   |  2 +-
>  include/linux/cpuidle_haltpoll.h          |  5 +++++
>  16 files changed, 79 insertions(+), 27 deletions(-)
>  create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h

--
ankur