Hi Bjorn,

On 5/6/22 18:51, Bjorn Helgaas wrote:
> On Thu, May 05, 2022 at 05:20:16PM +0200, Hans de Goede wrote:
>> Some BIOS-es contain bugs where they add addresses which are already
>> used in some other manner to the PCI host bridge window returned by
>> the ACPI _CRS method. To avoid this Linux by default excludes
>> E820 reservations when allocating addresses since 2010, see:
>> commit 4dc2287c1805 ("x86: avoid E820 regions when allocating address
>> space").
>>
>> Recently (2019) some systems have shown-up with E820 reservations which
>> cover the entire _CRS returned PCI bridge memory window, causing all
>> attempts to assign memory to PCI BARs which have not been setup by the
>> BIOS to fail. For example here are the relevant dmesg bits from a
>> Lenovo IdeaPad 3 15IIL 81WE:
>>
>>  [mem 0x000000004bc50000-0x00000000cfffffff] reserved
>>  pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window]
>>
>> The ACPI specifications appear to allow this new behavior:
>>
>> The relationship between E820 and ACPI _CRS is not really very clear.
>> ACPI v6.3, sec 15, table 15-374, says AddressRangeReserved means:
>>
>>   This range of addresses is in use or reserved by the system and is
>>   not to be included in the allocatable memory pool of the operating
>>   system's memory manager.
>>
>> and it may be used when:
>>
>>   The address range is in use by a memory-mapped system device.
>>
>> Furthermore, sec 15.2 says:
>>
>>   Address ranges defined for baseboard memory-mapped I/O devices, such
>>   as APICs, are returned as reserved.
>>
>> A PCI host bridge qualifies as a baseboard memory-mapped I/O device,
>> and its apertures are in use and certainly should not be included in
>> the general allocatable pool, so the fact that some BIOS-es reports
>> the PCI aperture as "reserved" in E820 doesn't seem like a BIOS bug.
>>
>> So it seems that the excluding of E820 reserved addresses is a mistake.
>>
>> Ideally Linux would fully stop excluding E820 reserved addresses,
>> but then various old systems will regress.
>> Instead keep the old behavior for old systems, while ignoring
>> the E820 reservations for any systems from now on.
>>
>> Old systems are defined here as BIOS year < 2018, this was chosen to
>> make sure that pci_use_e820 will not be set on the currently affected
>> systems, the oldest known one is from 2019.
>>
>> Testing has shown that some newer systems also have a bad _CRS return.
>> The pci_crs_quirks DMI table is used to keep excluding E820 reservations
>> from the bridge window on these systems.
>>
>> Also add pci=no_e820 and pci=use_e820 options to allow overriding
>> the BIOS year + DMI matching logic.
>>
>> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=206459
>> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1868899
>> BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1871793
>> BugLink: https://bugs.launchpad.net/bugs/1878279
>> BugLink: https://bugs.launchpad.net/bugs/1931715
>> BugLink: https://bugs.launchpad.net/bugs/1932069
>> BugLink: https://bugs.launchpad.net/bugs/1921649
>> Cc: Benoit Grégoire <benoitg@xxxxxxxx>
>> Cc: Hui Wang <hui.wang@xxxxxxxxxxxxx>
>> Signed-off-by: Hans de Goede <hdegoede@xxxxxxxxxx>
>
>> +	 * Ideally Linux would fully stop using E820 reservations, but then
>> +	 * various old systems will regress. Instead keep the old behavior for
>> +	 * old systems + known to be broken newer systems in pci_crs_quirks.
>> +	 */
>> +	if (year >= 0 && year < 2018)
>> +		pci_use_e820 = true;
>
> How did you pick 2018? Prior to this patch, we used E820 reservations
> for all machines. This patch would change that for 2019-2022
> machines, so there's a risk of breaking some of them.

Correct. I picked 2018 because the first devices on which using E820
reservations causes issues (the i2c controller not getting resources,
leading to a non-working touchpad, and thunderbolt hotplug issues)
have BIOS dates starting in 2019.
I added a year of margin, so we could also make this 2019.

> I'm hesitant about changing the behavior for machines already in the
> field because if they were tested at all with Linux, it was without
> this patch. So I would lean toward preserving the current behavior
> for BIOS year < 2023.

I see, I presume the idea is to then use DMI to disable E820 clipping
on current devices where this is known to cause problems?

So for v8 I would:

1. Change the cut-off check to < 2023
2. Drop the DMI quirks I added for models which are known to need E820
   clipping hit by the < 2018 check
3. Add DMI quirks for models for which it is known that we must _not_
   do E820 clipping

Is this the direction you want to go / does that sound right?

Note the DMI list for 3. will initially very likely be incomplete, but
I can ask around for testing once we have settled on this approach
and do one or more follow-up patches to extend the list.

>> diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
>> index 9e1e6b8d8876..7e6f79aab6a8 100644
>> --- a/arch/x86/pci/common.c
>> +++ b/arch/x86/pci/common.c
>> @@ -595,6 +595,12 @@ char *__init pcibios_setup(char *str)
>>  	} else if (!strcmp(str, "nocrs")) {
>>  		pci_probe |= PCI_ROOT_NO_CRS;
>>  		return NULL;
>> +	} else if (!strcmp(str, "use_e820")) {
>> +		pci_probe |= PCI_USE_E820;
>
> I think we should add_taint(TAINT_FIRMWARE_WORKAROUND) for both these
> cases.

Ok, I'll add this for v8.

>
> We probably should do it for *all* the parameters here, but that would
> be a separate discussion.
>
>> +		return NULL;
>> +	} else if (!strcmp(str, "no_e820")) {
>> +		pci_probe |= PCI_NO_E820;
>> +		return NULL;
>>  #ifdef CONFIG_PHYS_ADDR_T_64BIT
>>  	} else if (!strcmp(str, "big_root_window")) {
>>  		pci_probe |= PCI_BIG_ROOT_WINDOW;
>> --
>> 2.36.0
>>
>

Regards,

Hans
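
P.S. To make the three-step v8 plan above easier to discuss, here is a
rough standalone sketch of the decision order it implies. This is a
userspace illustration only, not the kernel code: the function name
should_clip_e820(), the enum e820_override, the E820_CUTOFF_YEAR macro
and the dmi_says_no_clip flag are all made up for this example. It
assumes the pci=use_e820 / pci=no_e820 options keep overriding both the
BIOS year check and the DMI match, as the commit message describes, and
it reuses the "year >= 0" guard from the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names throughout; this only models the proposed policy. */
enum e820_override {
	E820_OVERRIDE_NONE,	/* neither option given on the command line */
	E820_OVERRIDE_USE,	/* models pci=use_e820 */
	E820_OVERRIDE_NO,	/* models pci=no_e820 */
};

/* Proposed v8 cut-off; the exact year is still under discussion. */
#define E820_CUTOFF_YEAR 2023

/*
 * Return true if E820 reservations should be excluded (clipped) from the
 * host bridge windows.
 *
 * bios_year:        BIOS year from DMI, negative if unknown
 * dmi_says_no_clip: a DMI quirk says this model must not be clipped
 * override:         command-line override, if any
 */
bool should_clip_e820(int bios_year, bool dmi_says_no_clip,
		      enum e820_override override)
{
	bool clip = false;

	/* Step 1: keep the old behavior for BIOS years below the cut-off. */
	if (bios_year >= 0 && bios_year < E820_CUTOFF_YEAR)
		clip = true;

	/* Step 3: DMI quirk for models known to break when clipped. */
	if (dmi_says_no_clip)
		clip = false;

	/* The command-line options override year + DMI matching. */
	if (override == E820_OVERRIDE_USE)
		clip = true;
	else if (override == E820_OVERRIDE_NO)
		clip = false;

	return clip;
}

/* A few example decisions under the assumptions above. */
void e820_policy_examples(void)
{
	/* 2019 BIOS without a quirk keeps the current (clipping) behavior. */
	assert(should_clip_e820(2019, false, E820_OVERRIDE_NONE));
	/* Same year, but listed in the "must not clip" DMI quirks. */
	assert(!should_clip_e820(2019, true, E820_OVERRIDE_NONE));
	/* BIOS year at or past the cut-off: no clipping by default. */
	assert(!should_clip_e820(2023, false, E820_OVERRIDE_NONE));
}
```

With this ordering a quirked model can still be forced back to clipping
with pci=use_e820, which seems useful for testing while the DMI list is
incomplete.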