Re: [PATCH] ACPI: processor_idle: Skip dummy wait for processors based on the Zen microarchitecture

K Prateek Nayak <kprateek.nayak@xxxxxxx> · Thu, 22 Sep 2022 11:14:45 +0530

Hello Dave,

On 9/21/2022 7:45 PM, Dave Hansen wrote:
> On 9/20/22 23:36, K Prateek Nayak wrote:
>> +	/*
>> +	 * No delay is needed if we are in guest or on a processor
>> +	 * based on the Zen microarchitecture.
>> +	 */
>> +	if (boot_cpu_has(X86_FEATURE_HYPERVISOR) || boot_cpu_has(X86_FEATURE_ZEN))
>>  		return;
> 
> In the end, the delay is because of buggy, circa 2006 chipsets?  So, we
> use a CPU vendor specific check to approximate that the chipset is
> recent and not affected by the bug?  If so, is there no better way to
> check for a newer chipset than this?

Elsewhere in the thread, people have noted that the faulty chipsets seem to
go all the way back to pre-2002. Andreas's comment was added in 2006, but we
have no way of knowing whether the problem is limited to chipsets prior to
2006. If anyone can confirm a clean cut-off point after which the dummy wait
was no longer required, perhaps we can limit it to the older chipsets by
annotating them with an X86_BUG_STPCLK quirk.
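
To make the quirk idea concrete, here is a rough userspace sketch of what
such a check could look like. It is only an illustration under stated
assumptions: X86_BUG_STPCLK does not exist today, and struct cpu_model,
needs_dummy_wait() and demo() are made-up names that model the decision
with plain booleans instead of boot_cpu_has().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct cpu_model {
	bool hypervisor;	/* stands in for X86_FEATURE_HYPERVISOR */
	bool bug_stpclk;	/* stands in for the proposed X86_BUG_STPCLK */
};

/* Return true only when the dummy wait after the C2 read should be kept. */
static bool needs_dummy_wait(const struct cpu_model *cpu)
{
	/* Guests never need the delay, as in the patch above. */
	if (cpu->hypervisor)
		return false;

	/* Otherwise only systems annotated as affected pay for the extra read. */
	return cpu->bug_stpclk;
}

/* Usage: an annotated bare-metal system keeps the wait, others skip it. */
static void demo(void)
{
	struct cpu_model old_chipset = { .hypervisor = false, .bug_stpclk = true };
	struct cpu_model zen_host = { .hypervisor = false, .bug_stpclk = false };

	assert(needs_dummy_wait(&old_chipset));
	assert(!needs_dummy_wait(&zen_host));
}
```

The appeal of this shape is that the default becomes "no dummy wait", so
new platforms would not need a vendor check at all. It does depend on
someone being able to enumerate the affected chipsets, which is the open
question above.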

> 
> Do X86_FEATURE_ZEN CPUs just have unusually painful
> inl(acpi_fadt.xpm_tmr_blk.address) implementations?

Yes. The issue becomes more pronounced with increased core counts, when many
cores exit from C2 simultaneously. Core density is especially high on
X86_FEATURE_ZEN processors, none of which require the dummy wait to ensure
correct behavior. Hence, we used the feature check to skip it.

> Is that why we
> noticed all of a sudden?
> 

We saw run-to-run variance in tbench with 128 clients as part of our
scheduler regression runs. Originally we attributed it to the tbench
threads not being spread around uniformly, but the problem persisted even
when we ensured that the initial task placement spreads them out. Further
analysis showed that significant time was spent in exit from C2 in the
bad runs.

--
Thanks and Regards,
Prateek