Hi Lorenzo, On Wed, 2018-10-17 at 15:12 +0100, Lorenzo Pieralisi wrote: > On Mon, Oct 15, 2018 at 06:48:08PM -0600, Jon Derrick wrote: > > The VMD endpoint device exposes a domain of root ports and any > > downstream devices attached to that hierarchy. VMD domains > > consisting of > > the root ports and downstream devices are not represented in the > > ACPI > > tables and _OSC is unsupported. Because of this non-standard way of > > signaling firmware-first enabling on the root ports, the VMD device > > instead advertises support for firmware-first on the root ports by > > setting its interface bit to 0x1. > > > > When firmware-first is enabled on a VMD domain, the driver sets up > > the > > root port control registers to generate SMI system interrupts to > > firmware on errors. System firmware will handle the error as it > > sees > > fit, then passes back control to VMD with a synthesized MSI > > message. > > > > Because of this kernel pass-back, the driver does not disable the > > native > > AER port service driver attached to the VMD root ports, allowing > > for > > further kernel error handling if desired. > > This patch looks more like a policy to detect whether the generic > Root Port system error reporting should be enabled or not, or at > least may be generalized as such. > It is contained in the VMD driver - I would like to have Keith and > Bjorn > opinions before merging it, see below. It is inherently a policy decision by the BIOS. > > > Signed-off-by: Jon Derrick <jonathan.derrick@xxxxxxxxx> > > --- > > drivers/pci/controller/vmd.c | 30 ++++++++++++++++++++++++++++++ > > 1 file changed, 30 insertions(+) > > > > diff --git a/drivers/pci/controller/vmd.c > > b/drivers/pci/controller/vmd.c > > index 46ed80f..9625dca 100644 > > --- a/drivers/pci/controller/vmd.c > > +++ b/drivers/pci/controller/vmd.c > > @@ -589,6 +589,7 @@ static int vmd_enable_domain(struct vmd_dev > > *vmd, unsigned long features) > > LIST_HEAD(resources); > > resource_size_t offset[2] = {0}; > > resource_size_t membar2_offset = 0x2000, busn_start = 0; > > + u8 interface; > > > > /* > > * Shadow registers may exist in certain VMD device ids > > which allow > > @@ -718,6 +719,35 @@ static int vmd_enable_domain(struct vmd_dev > > *vmd, unsigned long features) > > dev_set_msi_domain(&vmd->bus->dev, vmd->irq_domain); > > pci_rescan_bus(vmd->bus); > > > > + /* > > + * Certain VMD devices may request firmware-first error > > handling > > + * support on the domain. These domains are virtual and > > not described > > + * by ACPI and must be configured manually. VMD domains > > which utilize > > + * firmware-first may still require further kernel error > > handling, but > > + * the domain is intended to first interrupt upon error to > > system > > + * firmware before being passed back to the kernel. The > > system error > > + * handling bits in the root port control register must be > > enabled > > + * following the AER service driver configuration in order > > to generate > > + * these system interrupts. > > + * > > + * Because the root ports are not described by ACPI and > > _OSC is > > + * unsupported in VMD domains, the intent to use firmware- > > first error > > + * handling in the root ports is instead described by the > > VMD device's > > + * interface bit. > > + */ > > + pci_read_config_byte(vmd->dev, PCI_CLASS_PROG, > > &interface); > > + if (interface == 0x1) { > > + struct pci_dev *rpdev; > > + > > + list_for_each_entry(rpdev, &vmd->bus->devices, > > bus_list) { > > + if (rpdev->aer_cap) > > This should be CONFIG_PCIEAER guarded but I would like to understand > its > logic. I think we should consider allowing it for CONFIG_PCIEAER=n. The firmware should be able to try and manage the error even if it occurs when native aer isn't built-in. It would be worse in that respect if the error went left unchecked. > IIUC this assumes all devices in the root bus are root ports, which > is > an VMD assumption I reckon. That's true for the VMD hardware so far. We don't expect any devices besides root ports on the root bus. I would rather drop the dev->aer_cap check, but I can add a pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT check > > > > + pcie_capability_set_word(rpdev, > > PCI_EXP_RTCTL, > > + PCI_EXP_R > > TCTL_SECEE | > > + PCI_EXP_R > > TCTL_SENFEE | > > + PCI_EXP_R > > TCTL_SEFEE); > > I wonder whether this code should be part of the generic AER layer > (ie > the PCIE port driver) rather than VMD specific, after all that's part > of > the generic specifications, I do not know if we can envisage an API > that > allow PCI controller drivers to enable/disable system errors. > > Thoughts ? > > Lorenzo I did consider this option first. There's a call in aer.c to disable these bits, so the API need has precendence. We would need this API reachable by vmd.c. It seems easiest to inline in pci.h. > > > + } > > + } > > + > > WARN(sysfs_create_link(&vmd->dev->dev.kobj, &vmd->bus- > > >dev.kobj, > > "domain"), "Can't create symlink to > > domain\n"); > > return 0; > > -- > > 1.8.3.1 > >