On Thu, 18 Jun 2020 11:20:04 -0500 Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote: > On Thu, Jun 18, 2020 at 09:48:29AM +0100, Jonathan Cameron wrote: > > On Wed, 17 Jun 2020 11:25:55 -0700 > > "Kuppuswamy, Sathyanarayanan" > > <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx> wrote: > > > On 6/17/20 10:36 AM, Sean V Kelley wrote: > > > > On Tue, 2020-06-16 at 14:24 -0500, Bjorn Helgaas wrote: > > > >> On Fri, May 22, 2020 at 01:31:34AM +0800, Jonathan Cameron > > > >> wrote: > > > > IIUC, we are trying to solve multiple issues here. > > > > > > 1. Error detection and recovery support for RCiEPs and RCEC. > > > 2. Firmware first exception for case 1. > > > 3. AEPI based handling for case 1 (I think this is the case > > > Jonathan is trying to handle) > > > > I'm not sure it separates that cleanly but the flow I'm interested > > in is firmware first with errors reported using APEI / GHESv2 etc. > > > > In particular without an RCEC, as it should (I think) play no part > > in that path. One of the main aims of me bringing this forwards at > > this stage is to establish whether I need to get our hardware teams > > to put an RCEC in place for future hardware or not. Right now we > > have some work arounds in place as we can reroute some of these > > errors directly to device interrupts. > > > > We haven't been able to come up with a reason why we need an RCEC > > given our approach to error handling. > > If you only want to use ACPI APEI and you never need native AER > support, you may not need an RCEC. But note that an RCEC *is* > required if you want PME messages or Function Readiness Status (FRS) > for RCiEPs. Understood. > > > > >>> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c > > > >>> index 14bb8f54723e..d34be4483f73 100644 > > > >>> --- a/drivers/pci/pcie/err.c > > > >>> +++ b/drivers/pci/pcie/err.c > > > >>> @@ -153,6 +153,67 @@ pci_ers_result_t pcie_do_recovery(struct > > > >>> pci_dev *dev, > > > >>> pci_ers_result_t status = > > > >>> PCI_ERS_RESULT_CAN_RECOVER; struct pci_bus *bus; > > > I am curious what bus (dev->subordinate) does RCEC and RCiEP > > > belongs to ? > > > > I'm not quite sure what you are asking so... > > > > They are effectively the same as root ports, and so sit on a bus > > specified via ACPI. In our case IORT. The root complex includes a > > bunch of bus numbers on which these devices can be discovered. > > 0, 74, 78, 7a, 7b, 7c, 80, b4, ba, bb, b8, bc on the 2 socket > > machine I have to hand. Those buses have a mix of root ports, and > > RCiEPs on them. > > I don't know much about IORT, but it's not the way we discover these > devices. > > RCiEPs and RCECs should be discovered the same way as Root Ports: a > PNP0A03 device in the ACPI namespace describes a PCI host bridge, and > the _CRS method of that PNP0A03 device contains a bus number range. > The base of that range is the root "bus" on which Root Ports, RCiEPs, > RCECs, and possibly some Conventional PCI devices exist. > > The machine Jonathan mentioned above should have 12 PNP0A03 devices, > one for each of the root buses he listed. You are of course correct. That particular bit of info is available in both places, but enumeration comes the DSDT entries you mention. > > > As to subordinate (i.e. where they bridge to) I don't think it > > would have any meaning. For reference I checked one of our RCiEPs > > and it's set to 0 on the config space. > > RCiEPs and RCECs are type 0 devices, so they're not bridges. They > have no secondary bus. Thanks