On 9/22/2022 8:47 PM, Dave Hansen wrote:
Old, circa 2002 chipsets have a bug: they don't go idle when they are supposed to. So, a workaround was added to slow the CPU down and ensure that the CPU waits a bit for the chipset to actually go idle. This workaround is ancient and has been in place in some form since the original kernel ACPI implementation. But, this workaround is very painful on modern systems. The "inl()" can take thousands of cycles (see Link: for some more detailed numbers and some fun kernel archaeology). First and foremost, modern systems should not be using this code. Typical Intel systems have not used it in over a decade because it is horribly inferior to MWAIT-based idle. Despite this, people do seem to be tripping over this workaround on AMD system today. Limit the "dummy wait" workaround to Intel systems. Keep Modern AMD systems from tripping over the workaround. Remotely modern Intel systems use intel_idle instead of this code and will, in practice, remain unaffected by the dummy wait. Signed-off-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx> Cc: Len Brown <lenb@xxxxxxxxxx> Cc: Mario Limonciello <Mario.Limonciello@xxxxxxx> Cc: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx> Cc: Borislav Petkov <bp@xxxxxxxxx> Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx> Reported-by: K Prateek Nayak <kprateek.nayak@xxxxxxx> Link: https://lore.kernel.org/all/20220921063638.2489-1-kprateek.nayak@xxxxxxx/
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx> or do you want me to pick this up?
--- drivers/acpi/processor_idle.c | 23 ++++++++++++++++++++--- 1 file changed, 20 insertions(+), 3 deletions(-) diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c index 16a1663d02d4..9f40917c49ef 100644 --- a/drivers/acpi/processor_idle.c +++ b/drivers/acpi/processor_idle.c @@ -531,10 +531,27 @@ static void wait_for_freeze(void) /* No delay is needed if we are in guest */ if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) return; + /* + * Modern (>=Nehalem) Intel systems use ACPI via intel_idle, + * not this code. Assume that any Intel systems using this + * are ancient and may need the dummy wait. This also assumes + * that the motivating chipset issue was Intel-only. + */ + if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL) + return; #endif - /* Dummy wait op - must do something useless after P_LVL2 read - because chipsets cannot guarantee that STPCLK# signal - gets asserted in time to freeze execution properly. */ + /* + * Dummy wait op - must do something useless after P_LVL2 read + * because chipsets cannot guarantee that STPCLK# signal gets + * asserted in time to freeze execution properly + * + * This workaround has been in place since the original ACPI + * implementation was merged, circa 2002. + * + * If a profile is pointing to this instruction, please first + * consider moving your system to a more modern idle + * mechanism. + */ inl(acpi_gbl_FADT.xpm_timer_block.address); }