Re: [PATCH] x86/pci: Stop requiring MMCONFIG to be declared in E820, ACPI or EFI for newer systems


 



On 12/5/2023 10:17, Bjorn Helgaas wrote:
On Tue, Dec 05, 2023 at 09:48:45AM -0600, Mario Limonciello wrote:
commit 7752d5cfe3d1 ("x86: validate against acpi motherboard resources")
introduced checks for ensuring that MCFG table also has memory region
reservations to ensure no conflicts were introduced from a buggy BIOS.

This has proceeded over time to add other types of reservation checks
for ACPI PNP resources and EFI MMIO memory type.  The PCI firmware spec
however says that these checks are only required when the operating system
doesn't comprehend the firmware region:

```
If the operating system does not natively comprehend reserving the MMCFG
region, the MMCFG region must be reserved by firmware. The address range
reported in the MCFG table or by _CBA method (see Section 4.1.3) must be
reserved by declaring a motherboard resource. For most systems, the
motherboard resource would appear at the root of the ACPI namespace
(under \_SB) in a node with a _HID of EISAID (PNP0C02), and the resources
in this case should not be claimed in the root PCI bus’s _CRS. The
resources can optionally be returned in Int15 E820h or EFIGetMemoryMap
as reserved memory but must always be reported through ACPI as a
motherboard resource.
```

My understanding is that native comprehension would mean Linux knows
how to discover and/or configure the MMCFG base address and size in
the hardware and that Linux would then reserve that region so it's not
used for anything else.

Linux doesn't have that, at least for x86.  It relies on the MCFG
table to discover the MMCFG region, and it relies on PNP0C02 _CRS to
reserve it.

Using MCFG to discover it matches the PCI firmware spec, but as I point out above, the decision to reserve this region doesn't require PNP0C01/PNP0C02 _CRS.

This is a decision made by Linux historically.


Running this check causes problems with accessing extended PCI
configuration space on OEM laptops that don't specify the region in PNP
resources or in the EFI memory map. That later manifests as problems with
dGPU and accessing resizable BAR.

Is there a problem report we can reference here?

Nothing public to share. AMD BIOS team is in discussion with the OEM to add the reservation in a BIOS upgrade so it works with things like the LTS kernels.

Knowing that Windows works without it, I feel this is still something we should look at fixing from an upstream perspective, which is what prompted my patch and this discussion.


Does the problem still occur with this series?
https://lore.kernel.org/r/20231121183643.249006-1-helgaas@xxxxxxxxxx

This appeared in linux-next 20231130.

Thanks for sharing that. If I do respin a variation of this patch I'll rebase on top of that.

I had a try with that series on top of 6.7-rc4, but it doesn't fix the issue (but obviously the patch I sent does).

# journalctl -k | grep ECAM
Dec 05 06:37:46 cl-fw-fedora kernel: PCI: ECAM [mem 0xe0000000-0xefffffff] (base 0xe0000000) for domain 0000 [bus 00-ff]
Dec 05 06:37:46 cl-fw-fedora kernel: PCI: not using ECAM ([mem 0xe0000000-0xefffffff] not reserved)
Dec 05 06:37:46 cl-fw-fedora kernel: PCI: ECAM [mem 0xe0000000-0xefffffff] (base 0xe0000000) for domain 0000 [bus 00-ff]
Dec 05 06:37:46 cl-fw-fedora kernel: PCI: [Firmware Info]: ECAM [mem 0xe0000000-0xefffffff] not reserved in ACPI motherboard resources
Dec 05 06:37:46 cl-fw-fedora kernel: PCI: not using ECAM ([mem 0xe0000000-0xefffffff] not reserved)
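
For reference, the window in the log above is consistent with the standard ECAM layout, which gives each bus 1 MiB (32 devices x 8 functions x 4 KiB of config space per function). A minimal sketch of that arithmetic follows. The helper names are hypothetical and are not kernel functions, and the offset is taken relative to the base address for bus 0.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative helpers only; these names are assumptions made for this
 * sketch and do not exist in the kernel.
 */

/* Size in bytes of an ECAM window covering buses [bus_start, bus_end]. */
static uint64_t ecam_window_size(uint8_t bus_start, uint8_t bus_end)
{
	assert(bus_end >= bus_start);
	/* One bus occupies 1 MiB of ECAM space. */
	return ((uint64_t)(bus_end - bus_start) + 1) << 20;
}

/* Offset of a config register from the ECAM base address of bus 0. */
static uint64_t ecam_offset(uint8_t bus, uint8_t dev, uint8_t fn, uint16_t reg)
{
	return ((uint64_t)bus << 20) |
	       ((uint64_t)(dev & 0x1f) << 15) |
	       ((uint64_t)(fn & 0x7) << 12) |
	       (uint64_t)(reg & 0xfff);
}
```

With [bus 00-ff] the window is 0x10000000 bytes, which matches [mem 0xe0000000-0xefffffff]. Registers at offset 0x100 and above (extended config space, where the Resizable BAR capability lives) are only reachable through this window, which is why rejecting ECAM shows up as the dGPU and resizable BAR problems.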


Similar problems don't exist in Windows 11 with exact same
laptop/firmware stack, and in discussion with AMD's BIOS team
Windows doesn't have similar checks.

I would love to know AMD BIOS team's take on this.  Does the BIOS
reserve the MMCFG space in any way?

On the AMD reference platform that this OEM system is based on, it is reserved in the EFI memory map. So with a 6.7-based kernel on the reference system you can see this emitted:

PCI: MMCONFIG at [mem 0xe0000000-0xefffffff] reserved as EfiMemoryMappedIO

But on the OEM system this is not reserved by EFI memory map or _CRS.

That's why my assumption after reading the firmware spec and seeing the behavior is that Windows makes the reservation *based on* what's in MCFG.


As this series of checks was first introduced as a mitigation for buggy
BIOS before EFI was introduced, add a BIOS date range to only enforce the
checks on hardware that predates the release of Windows 11.

Many of the MMCFG checks in Linux are historical artifacts that are
likely related to Linux defects, not BIOS defects, so I wouldn't
expect to see them in Windows.  But it's hard to remove them now.

I guess I was hoping that by drawing a line in the sand we could avoid breaking anything that was relying upon the older behavior.


Link: https://members.pcisig.com/wg/PCI-SIG/document/15350
       PCI Firmware Specification 3.3
       Section 4.1.2 MCFG Table Description Note 2
Signed-off-by: Mario Limonciello <mario.limonciello@xxxxxxx>
---
  arch/x86/pci/mmconfig-shared.c | 10 +++++++---
  1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c
index 4b3efaa82ab7..e4594b181ebf 100644
--- a/arch/x86/pci/mmconfig-shared.c
+++ b/arch/x86/pci/mmconfig-shared.c
@@ -570,9 +570,13 @@ static void __init pci_mmcfg_reject_broken(int early)
  	list_for_each_entry(cfg, &pci_mmcfg_list, list) {
  		if (pci_mmcfg_check_reserved(NULL, cfg, early) == 0) {
-			pr_info(PREFIX "not using MMCONFIG\n");
-			free_all_mmcfg();
-			return;
+			if (dmi_get_bios_year() >= 2021) {
+				pr_info(PREFIX "MMCONFIG wasn't reserved by ACPI or EFI\n");

I think this leads to using the MMCONFIG area without reserving it
anywhere, so we may end up assigning that space to something else,
which won't work, i.e., the problem described here:
https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/commit/?id=5cef3014e02d

+			} else {
+				pr_info(PREFIX "not using MMCONFIG\n");
+				free_all_mmcfg();
+				return;
+			}
  		}
  	}
  }
--
2.34.1
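
To make the trade-off in the hunk above concrete, here is a small userspace model of the decision the patch introduces, together with the overlap condition behind the review comment. It is only a sketch under stated assumptions: the enum, `mmcfg_decide()` and `ranges_overlap()` are hypothetical names, `reserved` stands in for a non-zero result from pci_mmcfg_check_reserved(), and `bios_year` stands in for the value returned by dmi_get_bios_year().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome labels for this sketch; the kernel has no such enum. */
enum mmcfg_action {
	MMCFG_USE,            /* region is reserved: use MMCONFIG as before */
	MMCFG_USE_UNRESERVED, /* patched path: keep MMCONFIG, nothing reserves it */
	MMCFG_REJECT,         /* pre-patch path: free all MMCONFIG entries */
};

/* Model of the per-entry decision in the patched loop. */
static enum mmcfg_action mmcfg_decide(bool reserved, int bios_year)
{
	if (reserved)
		return MMCFG_USE;
	if (bios_year >= 2021)
		return MMCFG_USE_UNRESERVED;
	/* Older or unknown BIOS dates keep the historical behavior. */
	return MMCFG_REJECT;
}

/* True if two inclusive address ranges share at least one byte. */
static bool ranges_overlap(uint64_t a_start, uint64_t a_end,
			   uint64_t b_start, uint64_t b_end)
{
	assert(a_start <= a_end && b_start <= b_end);
	return a_start <= b_end && b_start <= a_end;
}
```

In the `MMCFG_USE_UNRESERVED` case nothing prevents a later resource assignment from landing inside the window. For example, a 1 MiB BAR placed at 0xe8000000 would overlap [mem 0xe0000000-0xefffffff], which is the failure mode the review comment points at.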





