Hi all, On 4/17/24 10:40 PM, Bjorn Helgaas wrote: > From: Bjorn Helgaas <bhelgaas@xxxxxxxxxx> > > Arul, Mateusz, Imcarneiro91, and Aman reported a regression caused by > 07eab0901ede ("efi/x86: Remove EfiMemoryMappedIO from E820 map"). On the > Lenovo Legion 9i laptop, that commit removes the area containing ECAM from > E820, which means the early E820 validation started failing, which meant we > didn't enable ECAM in the "early MCFG" path > > The lack of ECAM caused many ACPI methods to fail, resulting in the > embedded controller, PS/2, audio, trackpad, and battery devices not being > detected. The _OSC method also failed, so Linux could not take control of > the PCIe hotplug, PME, and AER features: > > # pci_mmcfg_early_init() > > PCI: ECAM [mem 0xc0000000-0xce0fffff] (base 0xc0000000) for domain 0000 [bus 00-e0] > PCI: not using ECAM ([mem 0xc0000000-0xce0fffff] not reserved) > > ACPI Error: AE_ERROR, Returned by Handler for [PCI_Config] (20230628/evregion-300) > ACPI: Interpreter enabled > ACPI: Ignoring error and continuing table load > ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PC00.RP01._SB.PC00], AE_NOT_FOUND (20230628/dswload2-162) > ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20230628/psobject-220) > ACPI: Skipping parse of AML opcode: OpcodeName unavailable (0x0010) > ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PC00.RP01._SB.PC00], AE_NOT_FOUND (20230628/dswload2-162) > ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20230628/psobject-220) > ... > ACPI Error: Aborting method \_SB.PC00._OSC due to previous error (AE_NOT_FOUND) (20230628/psparse-529) > acpi PNP0A08:00: _OSC: platform retains control of PCIe features (AE_NOT_FOUND) > > # pci_mmcfg_late_init() > > PCI: ECAM [mem 0xc0000000-0xce0fffff] (base 0xc0000000) for domain 0000 [bus 00-e0] > PCI: [Firmware Info]: ECAM [mem 0xc0000000-0xce0fffff] not reserved in ACPI motherboard resources > PCI: ECAM [mem 0xc0000000-0xce0fffff] is EfiMemoryMappedIO; assuming valid > PCI: ECAM [mem 0xc0000000-0xce0fffff] reserved to work around lack of ACPI motherboard _CRS > > Per PCI Firmware r3.3, sec 4.1.2, ECAM space must be reserved by a PNP0C02 > resource, but it need not be mentioned in E820, so we shouldn't look at > E820 to validate the ECAM space described by MCFG. > > 946f2ee5c731 ("[PATCH] i386/x86-64: Check that MCFG points to an e820 > reserved area") added a sanity check of E820 to work around buggy MCFG > tables, but that over-aggressive validation causes failures like this one. > > Keep the E820 validation check only for older BIOSes (pre-2016) so the > buggy 2006-era machines don't break. Skip the early E820 check for 2016 > and newer BIOSes. I know a fix for this has been long in the making so I don't want to throw a spanner into the works, but I wonder why is the is_efi_mmio() check inside the if (!early && !acpi_disabled) {} block (before this patch) ? is_efi_mmio() only relies on EFI memdescriptors and those are setup pretty early. Assuming that the EFI memdescriptors are indeed setup before pci_mmcfg_reserved(..., ..., early=true) gets called we could simply move the is_efi_mmio(&cfg->res) outside (below) the if (!early && !acpi_disabled) {} so that it always runs before the is_mmconf_reserved(e820__mapped_all, ...) check. Looking at the dmesg above the is_efi_mmio() check does succeed, so this should fix the issue without needing a BIOS year check ? Regards, Hans > Fixes: 07eab0901ede ("efi/x86: Remove EfiMemoryMappedIO from E820 map") > Reported-by: Mateusz Kaduk <mateusz.kaduk@xxxxxxxxx> > Reported-by: Arul <...> > Reported-by: Imcarneiro91 <...> > Reported-by: Aman <...> > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218444 > Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx> > Tested-by: Mateusz Kaduk <mateusz.kaduk@xxxxxxxxx> > Cc: stable@xxxxxxxxxxxxxxx > --- > arch/x86/pci/mmconfig-shared.c | 35 +++++++++++++++++++++++++++------- > 1 file changed, 28 insertions(+), 7 deletions(-) > > diff --git a/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c > index 0cc9520666ef..53c7afa606c3 100644 > --- a/arch/x86/pci/mmconfig-shared.c > +++ b/arch/x86/pci/mmconfig-shared.c > @@ -518,7 +518,34 @@ static bool __ref pci_mmcfg_reserved(struct device *dev, > { > struct resource *conflict; > > - if (!early && !acpi_disabled) { > + if (early) { > + > + /* > + * Don't try to do this check unless configuration type 1 > + * is available. How about type 2? > + */ > + > + /* > + * 946f2ee5c731 ("Check that MCFG points to an e820 > + * reserved area") added this E820 check in 2006 to work > + * around BIOS defects. > + * > + * Per PCI Firmware r3.3, sec 4.1.2, ECAM space must be > + * reserved by a PNP0C02 resource, but it need not be > + * mentioned in E820. Before the ACPI interpreter is > + * available, we can't check for PNP0C02 resources, so > + * there's no reliable way to verify the region in this > + * early check. Keep it only for the old machines that > + * motivated 946f2ee5c731. > + */ > + if (dmi_get_bios_year() < 2016 && raw_pci_ops) > + return is_mmconf_reserved(e820__mapped_all, cfg, dev, > + "E820 entry"); > + > + return true; > + } > + > + if (!acpi_disabled) { > if (is_mmconf_reserved(is_acpi_reserved, cfg, dev, > "ACPI motherboard resource")) > return true; > @@ -554,12 +581,6 @@ static bool __ref pci_mmcfg_reserved(struct device *dev, > if (pci_mmcfg_running_state) > return true; > > - /* Don't try to do this check unless configuration > - type 1 is available. how about type 2 ?*/ > - if (raw_pci_ops) > - return is_mmconf_reserved(e820__mapped_all, cfg, dev, > - "E820 entry"); > - > return false; > } >