[STABLE REGRESSION] Possible missing backport of x86_match_cpu() change in v6.1.96

I upgraded from kernel 6.1.94 to 6.1.99 on one of my machines and noticed that
the dmesg line "Incomplete global flushes, disabling PCID" had disappeared from
the log.

That message comes from commit c26b9e193172f48cd0ccc64285337106fb8aa804, which
disables PCID support on some broken hardware in arch/x86/mm/init.c:

#define INTEL_MATCH(_model) { .vendor  = X86_VENDOR_INTEL,     \
                             .family  = 6,                     \
                             .model = _model,                  \
                           }
/*
 * INVLPG may not properly flush Global entries
 * on these CPUs when PCIDs are enabled.
 */
static const struct x86_cpu_id invlpg_miss_ids[] = {
       INTEL_MATCH(INTEL_FAM6_ALDERLAKE   ),
       INTEL_MATCH(INTEL_FAM6_ALDERLAKE_L ),
       INTEL_MATCH(INTEL_FAM6_ALDERLAKE_N ),
       INTEL_MATCH(INTEL_FAM6_RAPTORLAKE  ),
       INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_P),
       INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_S),
       {}
};

...

if (x86_match_cpu(invlpg_miss_ids)) {
        pr_info("Incomplete global flushes, disabling PCID");
        setup_clear_cpu_cap(X86_FEATURE_PCID);
        return;
}

arch/x86/mm/init.c, which contains that code, did not change between 6.1.94
and 6.1.99. However, I found a commit in 6.1.96 that changes how
x86_match_cpu() behaves:

commit 8ab1361b2eae44077fef4adea16228d44ffb860c
Author: Tony Luck <tony.luck@xxxxxxxxx>
Date:   Mon May 20 15:45:33 2024 -0700

    x86/cpu: Fix x86_match_cpu() to match just X86_VENDOR_INTEL

I suspect this broke the PCID disabling code in arch/x86/mm/init.c.
The commit message says:

"Add a new flags field to struct x86_cpu_id that has a bit set to indicate that
this entry in the array is valid. Update X86_MATCH*() macros to set that bit.
Change the end-marker check in x86_match_cpu() to just check the flags field
for this bit."

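But the PCID disabling code in 6.1.99 does not use the X86_MATCH*() macros.
Instead, it defines its own INTEL_MATCH() macro, which does not set the
X86_CPU_ID_FLAG_ENTRY_VALID flag. With the new end-marker check, the first
entry of invlpg_miss_ids[] then looks like the end of the table, so
x86_match_cpu() never finds a match.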
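
To illustrate the suspected failure mode, here is a minimal user-space sketch.
It is not the kernel code: struct cpu_id, OLD_MATCH(), NEW_MATCH(),
match_cpu(), FLAG_ENTRY_VALID and the model numbers are simplified stand-ins
that I made up for illustration. It only assumes what the commit message
describes, namely that the table walk stops at the first entry whose "valid"
flag bit is clear.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; not the kernel's definitions. */
#define VENDOR_INTEL     0
#define FLAG_ENTRY_VALID 0x1

struct cpu_id {
	unsigned short vendor;
	unsigned short family;
	unsigned short model;
	unsigned short flags;
};

/* Open-coded initializer in the style of INTEL_MATCH(): .flags stays 0. */
#define OLD_MATCH(_model) { .vendor = VENDOR_INTEL, .family = 6, \
			    .model = (_model) }

/* Initializer in the style of the X86_MATCH*() macros: sets the valid bit. */
#define NEW_MATCH(_model) { .vendor = VENDOR_INTEL, .family = 6, \
			    .model = (_model), .flags = FLAG_ENTRY_VALID }

/* Illustrative model numbers only. */
static const struct cpu_id old_table[] = {
	OLD_MATCH(0x97), OLD_MATCH(0x9a), { 0 }
};
static const struct cpu_id new_table[] = {
	NEW_MATCH(0x97), NEW_MATCH(0x9a), { 0 }
};

/* Walk the table until the first entry without the valid flag. */
static const struct cpu_id *match_cpu(const struct cpu_id *table,
				      unsigned short vendor,
				      unsigned short family,
				      unsigned short model)
{
	const struct cpu_id *m;

	for (m = table; m->flags & FLAG_ENTRY_VALID; m++) {
		if (m->vendor == vendor && m->family == family &&
		    m->model == model)
			return m;
	}
	return NULL;
}

/* Usage: the open-coded table never matches, the flagged one does. */
static void demo(void)
{
	assert(match_cpu(old_table, VENDOR_INTEL, 6, 0x97) == NULL);
	assert(match_cpu(new_table, VENDOR_INTEL, 6, 0x97) == &new_table[0]);
}
```

In this sketch the first entry of old_table has flags == 0, so the loop exits
immediately and every lookup returns NULL, which would be consistent with the
missing dmesg line.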

I looked in upstream git and found an existing fix:
commit 2eda374e883ad297bd9fe575a16c1dc850346075
Author: Tony Luck <tony.luck@xxxxxxxxx>
Date:   Wed Apr 24 11:15:18 2024 -0700

    x86/mm: Switch to new Intel CPU model defines

    New CPU #defines encode vendor and family as well as model.

    [ dhansen: vertically align 0's in invlpg_miss_ids[] ]

    Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
    Signed-off-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
    Signed-off-by: Borislav Petkov (AMD) <bp@xxxxxxxxx>
    Link: https://lore.kernel.org/all/20240424181518.41946-1-tony.luck%40intel.com

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 679893ea5e68..6b43b6480354 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -261,21 +261,17 @@ static void __init probe_page_size_mask(void)
        }
 }
-#define INTEL_MATCH(_model) { .vendor = X86_VENDOR_INTEL, \
-                             .family  = 6,                     \
-                             .model = _model,                  \
-                           }
 /*
  * INVLPG may not properly flush Global entries
  * on these CPUs when PCIDs are enabled.
  */
 static const struct x86_cpu_id invlpg_miss_ids[] = {
-       INTEL_MATCH(INTEL_FAM6_ALDERLAKE   ),
-       INTEL_MATCH(INTEL_FAM6_ALDERLAKE_L ),
-       INTEL_MATCH(INTEL_FAM6_ATOM_GRACEMONT ),
-       INTEL_MATCH(INTEL_FAM6_RAPTORLAKE  ),
-       INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_P),
-       INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_S),
+       X86_MATCH_VFM(INTEL_ALDERLAKE,      0),
+       X86_MATCH_VFM(INTEL_ALDERLAKE_L,    0),
+       X86_MATCH_VFM(INTEL_ATOM_GRACEMONT, 0),
+       X86_MATCH_VFM(INTEL_RAPTORLAKE,     0),
+       X86_MATCH_VFM(INTEL_RAPTORLAKE_P,   0),
+       X86_MATCH_VFM(INTEL_RAPTORLAKE_S,   0),
        {}
 };

The fix removes the custom INTEL_MATCH() macro and uses the X86_MATCH*()
macros, which set X86_CPU_ID_FLAG_ENTRY_VALID. This fix commit was never
backported to 6.1, so this looks like a stable-series regression caused by a
missing backport.

If I apply the fix on top of 6.1.99, the PCID disabling code activates again.
To make it build, I had to change all the INTEL_* names back to the old
INTEL_FAM6_* definitions:

 static const struct x86_cpu_id invlpg_miss_ids[] = {
-       INTEL_MATCH(INTEL_FAM6_ALDERLAKE   ),
-       INTEL_MATCH(INTEL_FAM6_ALDERLAKE_L ),
-       INTEL_MATCH(INTEL_FAM6_ALDERLAKE_N ),
-       INTEL_MATCH(INTEL_FAM6_RAPTORLAKE  ),
-       INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_P),
-       INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_S),
+       X86_MATCH_VFM(INTEL_FAM6_ALDERLAKE,    0),
+       X86_MATCH_VFM(INTEL_FAM6_ALDERLAKE_L,  0),
+       X86_MATCH_VFM(INTEL_FAM6_ALDERLAKE_N,  0),
+       X86_MATCH_VFM(INTEL_FAM6_RAPTORLAKE,   0),
+       X86_MATCH_VFM(INTEL_FAM6_RAPTORLAKE_P, 0),
+       X86_MATCH_VFM(INTEL_FAM6_RAPTORLAKE_S, 0),
        {}
 };

I only looked at the code in arch/x86/mm/init.c, so there may be other uses of
x86_match_cpu() in the kernel that are also broken in 6.1.99.
This email is meant as a bug report, not a pull request. Someone else should
confirm the problem and submit the appropriate fix.



