Relying on _OSC to be accurate about CPPC v2 support breaks scheduling on heterogeneous-core Intel systems with buggy firmware

My name is Aaron Rainbolt, and I am working as a developer with Kubuntu Focus.

In commit 7feec7430eddd, the `acpi_cppc_processor_probe()` function was modified to check the CPPC v2 bit in _OSC to determine whether CPPC v2 support is present on the system. If this bit is not set, `cpc_supported_by_cpu()` (defined in arch/x86/kernel/acpi/cppc.c) is called to check whether the processor belongs to a particular set of CPUs that support CPPC v2 even though the BIOS does not report it. If this function returns false, CPPC v2 is considered absent.
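
To make the decision flow above concrete, here is a minimal user-space sketch of it. This is a model, not the kernel source: the function name `cppc_probe_gate()` and both parameters are hypothetical stand-ins for the `osc_sb_cppc2_support_acked` flag and the return value of `cpc_supported_by_cpu()`.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * User-space model of the gate described above (hypothetical helper,
 * not kernel code).
 *
 * osc_acked:       firmware acknowledged the CPPC v2 bit in _OSC
 * cpu_allowlisted: stand-in for the result of cpc_supported_by_cpu()
 *
 * Returns 0 if probing would go on to parse _CPC, -ENODEV otherwise.
 */
int cppc_probe_gate(bool osc_acked, bool cpu_allowlisted)
{
	if (!osc_acked) {
		/* Firmware did not report CPPC v2; fall back to the CPU check. */
		if (!cpu_allowlisted)
			return -ENODEV;
	}
	return 0;
}
```

The affected machines land in the `-ENODEV` branch: the firmware does not set the bit, and the CPU is not one of those `cpc_supported_by_cpu()` accepts.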

While this works well on systems where the firmware accurately reports CPPC v2 support in _OSC, this causes a severe performance regression when using the new EEVDF scheduler on some machines. So far we've noted this issue on certain machines with i5-13500H processors, and have seen some reports of the same issue elsewhere on other hardware. All machines encountering this issue had two things in common:

* They use heterogeneous-core Intel processors
* They have buggy or misconfigured firmware. In the clearest cases, this firmware fails to report CPPC v2 support in _OSC even though CPPC v2 works.

When these two things are true, the EEVDF scheduler will oftentimes schedule processes on efficiency cores rather than performance cores, resulting in badly impaired single-core performance (my workplace was seeing 50% slower Geekbench 5 scores on some systems because of this bug). Some examples of the bug online can be seen here:

* Kernel.org Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=218195
* Same issue, same author, Star Labs Firmware bug tracker: https://github.com/StarLabsLtd/firmware/issues/143
* Similar but less-clear issue on the Manjaro forums: https://forum.manjaro.org/t/linux-kernel-6-6-lts-cpu-regression-on-i7-alderlake/157474
* Similar but less-clear issue on the Gentoo forums: https://forums.gentoo.org/viewtopic-p-8819389.html?sid=5997f89fd5a202b6db8396fba0b45821 (resolved by enabling Intel SpeedStep - I suspect the poster meant Intel SpeedShift here, though I can't be certain)

To test whether the _OSC mis-reporting CPPC v2 support was the issue, I recompiled the latest kernel for Ubuntu 24.04 with the following test patch:

--- cppc_acpi_old.c     2024-06-16 15:27:44.214202299 -0500
+++ cppc_acpi.c 2024-06-16 00:29:51.684020493 -0500
@@ -679,8 +679,13 @@

       if (!osc_sb_cppc2_support_acked) {
               pr_debug("CPPC v2 _OSC not acked\n");
+               /* KFOCUS TEST PATCH
+                * Some machines have a BIOS bug that causes
+                * this code path to be mistakenly hit. Ignore
+                * it and continue regardless.
               if (!cpc_supported_by_cpu())
                       return -ENODEV;
+               */
       }

       /* Parse the ACPI _CPC table for this CPU. */

This essentially ignores the results of the _OSC bit check and continues on to parsing the ACPI table regardless. This immediately resolves the problem in our testing - CPPC v2 appears enabled looking under /sys and /proc, and single-core performance improves dramatically.
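
For anyone who wants to run a similar check, the following is a minimal sketch of reading one CPPC value from sysfs. It assumes the per-CPU `acpi_cppc/highest_perf` attribute is the file of interest; the message above does not say which files under /sys were inspected, and the helper name `read_highest_perf()` is hypothetical.

```c
#include <stdio.h>

/*
 * Read <root>/cpu<N>/acpi_cppc/highest_perf into *out.
 * Returns 0 on success, -1 if the file is missing or cannot be parsed
 * (for example because CPPC was not probed on this system).
 */
int read_highest_perf(const char *root, int cpu, unsigned long *out)
{
	char path[256];
	FILE *f;
	int ok;

	snprintf(path, sizeof(path), "%s/cpu%d/acpi_cppc/highest_perf", root, cpu);
	f = fopen(path, "r");
	if (!f)
		return -1;

	ok = (fscanf(f, "%lu", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

Calling it with `/sys/devices/system/cpu` as `root` for each CPU number shows whether the `acpi_cppc` directory is populated at all. The root is passed as a parameter so the helper can also be pointed at a test directory.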

Looking through the mailing list archives, it does not appear that simply ignoring this bit is safe in the long run - apparently it can mess something up with USB4? (See https://marc.info/?l=linux-acpi&m=165704566017713&w=2 - I've CC'd Mario Limonciello on this.)

Some ideas I have for potential long-term fixes:

* Perhaps add a kernel parameter such as "force_cppc_v2" that will allow the user to choose whether to ignore this check or not? This isn't ideal, but it would work, I think.
* The `cpc_supported_by_cpu()` function appears to be used to work around this very bug for select AMD and Hygon CPUs. Would it be possible to add heterogeneous-core Intel CPUs to this function so that the _OSC CPPC v2 bit is overridden for all such processors?
* (Long shot) Make the new scheduler not need CPPC v2?
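
As a rough illustration of the first idea, here is a user-space sketch of how a "force_cppc_v2" flag could change the gate. Everything in it is an assumption for discussion: `cmdline_has_flag()` and `cppc_probe_gate_forced()` are hypothetical helpers, and a real patch would register the parameter through the kernel's own boot-parameter machinery rather than parse the command line by hand.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Return true if `flag` appears as a whole space-separated word in `cmdline`. */
bool cmdline_has_flag(const char *cmdline, const char *flag)
{
	size_t len = strlen(flag);
	const char *p = cmdline;

	if (len == 0)
		return false;

	while ((p = strstr(p, flag)) != NULL) {
		bool starts_word = (p == cmdline) || (p[-1] == ' ');
		bool ends_word = (p[len] == '\0') || (p[len] == ' ');

		if (starts_word && ends_word)
			return true;
		p++;	/* keep searching after a partial match */
	}
	return false;
}

/*
 * Model of the gate with the proposed override (hypothetical).
 * Returns 0 if probing would continue to _CPC parsing, -ENODEV otherwise.
 */
int cppc_probe_gate_forced(bool osc_acked, bool cpu_allowlisted,
			   bool force_cppc_v2)
{
	if (!osc_acked && !force_cppc_v2 && !cpu_allowlisted)
		return -ENODEV;
	return 0;
}
```

With the flag absent the behavior is unchanged, so only users who opt in would bypass the _OSC result.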

While not ideal, I think the kernel parameter solution is the safest, and it is also sufficient for Kubuntu Focus's purposes. I'll work on a patch that uses that strategy if no one objects or has better suggestions.

Thanks for your help!




