The VMD endpoint acts as a host bridge for a nested PCIe segment. This
segment and its ports and endpoints are not represented in ACPI tables
such as DSDT and HEST, so "firmware first" error handling is currently
unsupported in the driver in favor of OS native error handling.

VMD does support firmware error handling where errors are signaled to
host firmware via SMI. It also supports a "combined model" [1] where
OS native error handling is used in addition to firmware error
handling. Because of this combined model, this patch does not subscribe
to the kernel firmware-first architecture, which would prevent OS
native error handling.

The VMD segment does not support NMI callbacks from firmware to the OS.
After error handling, firmware instead causes VMD to issue an interrupt
via MSI vector 0, which is the same interrupt already used to signal
PCIe errors for OS native handling in the VMD PCIe segment.

In order to generate SMI on PCIe errors, BIOS is responsible for
setting the System Error bits pre-boot in the Root Control register for
the root ports in the VMD segment. BIOS signals its intent to use
firmware first in the VMD segment by setting the Programming Interface
to 0x1 in VMD's Class Code register. This patch checks for this value
and, if it is set, prevents the native AER driver from clearing the
System Error bits for root ports in the VMD segment.

[1] https://firmware.intel.com/sites/default/files/resources/A_Tour_beyond_BIOS_Implementing_APEI_with_UEFI_White_Paper.pdf p. 10.

Signed-off-by: Jon Derrick <jonathan.derrick@xxxxxxxxx>
---
 drivers/pci/controller/vmd.c | 35 ++++++++++++++++++++++++++++++++++-
 1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index 46ed80f..7269081 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -589,6 +589,7 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
 	LIST_HEAD(resources);
 	resource_size_t offset[2] = {0};
 	resource_size_t membar2_offset = 0x2000, busn_start = 0;
+	u8 interface;
 
 	/*
 	 * Shadow registers may exist in certain VMD device ids which allow
@@ -716,8 +717,40 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
 	vmd_attach_resources(vmd);
 	vmd_setup_dma_ops(vmd);
 	dev_set_msi_domain(&vmd->bus->dev, vmd->irq_domain);
-	pci_rescan_bus(vmd->bus);
 
+	/*
+	 * Certain VMD devices may request firmware-first error handling
+	 * support on the VMD domain. These domains are not described by ACPI
+	 * tables such as DSDT and HEST, so "firmware first" error handling is
+	 * currently unsupported in the driver in favor of OS native error
+	 * handling.
+	 *
+	 * VMD does support firmware error handling where errors are signaled
+	 * to firmware via SMI. It also supports a "combined model" where the
+	 * OS native error handling may be used in addition to firmware error
+	 * handling.
+	 *
+	 * Because of the lack of ACPI support on the domain and the capability
+	 * to use the 'combined model', the typical firmware-first architecture
+	 * is not used because it would disable OS native error handling.
+	 *
+	 * VMD domains do not support NMI callbacks to the OS. Pass-back to the
+	 * kernel from firmware is handled with a synthesized MSI on VMD device
+	 * vector 0, which is the same interrupt already used to signal PCIe
+	 * errors for OS native handling in the VMD domain and will trigger any
+	 * remaining required OS native error handling.
+	 *
+	 * Because the VMD domain is not described by ACPI, the intent to use
+	 * firmware-first error handling in the root ports is instead described
+	 * by the VMD device's programming interface bit being set to 0x1.
+	 */
+	pci_read_config_byte(vmd->dev, PCI_CLASS_PROG, &interface);
+	if (interface == 0x1) {
+		struct pci_host_bridge *host = pci_find_host_bridge(vmd->bus);
+		host->no_disable_sys_err = 1;
+	}
+
+	pci_rescan_bus(vmd->bus);
 	WARN(sysfs_create_link(&vmd->dev->dev.kobj, &vmd->bus->dev.kobj,
 			       "domain"), "Can't create symlink to domain\n");
 	return 0;
-- 
1.8.3.1
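
For readers following the flow described in the commit message, here is a
minimal userspace sketch of the decision this patch sets up. It is a model
under stated assumptions, not kernel code: the struct, the function names
and the MODEL_* macros are hypothetical, and the AER side shown here only
illustrates the commit message's statement that the native AER driver is
prevented from clearing the System Error bits. The three bit values are
meant to match the System Error enable bits of the PCIe Root Control
register.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * System Error enable bits in the PCIe Root Control register.
 * MODEL_* names are illustrative; the values are assumed to match the
 * PCI_EXP_RTCTL_SECEE/SENFEE/SEFEE definitions.
 */
#define MODEL_RTCTL_SECEE	0x0001	/* System Error on Correctable Error */
#define MODEL_RTCTL_SENFEE	0x0002	/* System Error on Non-Fatal Error */
#define MODEL_RTCTL_SEFEE	0x0004	/* System Error on Fatal Error */
#define MODEL_SYS_ERR_MASK \
	(MODEL_RTCTL_SECEE | MODEL_RTCTL_SENFEE | MODEL_RTCTL_SEFEE)

/* Programming Interface value BIOS sets to request firmware-first. */
#define MODEL_VMD_PROG_IF_FW_FIRST	0x1

/* Hypothetical stand-in for the one host bridge field the patch sets. */
struct host_bridge_model {
	bool no_disable_sys_err;
};

/* Models the check added to vmd_enable_domain(). */
static void vmd_model_probe(struct host_bridge_model *host, uint8_t prog_if)
{
	host->no_disable_sys_err = (prog_if == MODEL_VMD_PROG_IF_FW_FIRST);
}

/*
 * Models the assumed effect on the AER side: return the Root Control
 * value that would be left in place for a root port in this domain.
 */
static uint16_t aer_model_root_control(const struct host_bridge_model *host,
				       uint16_t rtctl)
{
	if (host->no_disable_sys_err)
		return rtctl;	/* keep the bits BIOS set pre-boot */

	return (uint16_t)(rtctl & ~MODEL_SYS_ERR_MASK);
}
```

With a Programming Interface of 0x1 the model leaves a Root Control value
such as 0x0007 untouched, so errors can still raise SMI; with any other
value the three System Error bits are cleared and the remaining bits are
preserved.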