On Tue, Sep 19, 2023 at 11:31:57AM +0800, Kai-Heng Feng wrote:
> On Wed, Sep 13, 2023 at 8:50 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> [snipped]
> > Hmm.  In some ways the VMD device acts as a Root Port, since it
> > originates a new hierarchy in a separate domain, but on the upstream
> > side, it's just a normal endpoint.
> >
> > How does AER for the new hierarchy work?  A device below the VMD can
> > generate ERR_COR/ERR_NONFATAL/ERR_FATAL messages.  I guess I was
> > assuming those messages would terminate at the VMD, and the VMD could
> > generate an AER interrupt just like a Root Port.  But that can't be
> > right because I don't think VMD would have the Root Error Command
> > register needed to manage that interrupt.
>
> VMD itself doesn't seem to manage AER, the rootport that "moved" from
> 0000 domain does:
> [ 2113.507345] pcieport 10000:e0:06.0: AER: Corrected error received:
> 10000:e1:00.0
> [ 2113.507380] nvme 10000:e1:00.0: PCIe Bus Error: severity=Corrected,
> type=Physical Layer, (Receiver ID)
> [ 2113.507389] nvme 10000:e1:00.0:   device [144d:a80a] error
> status/mask=00000001/0000e000
> [ 2113.507398] nvme 10000:e1:00.0:    [ 0] RxErr                  (First)

Oh, I forgot how VMD works.  It sounds like there *is* a Root Port
that is logically below the VMD, e.g., (from
https://bugzilla.kernel.org/show_bug.cgi?id=215027):

  ACPI: PCI Root Bridge [PC00] (domain 0000 [bus 00-e0])
  acpi PNP0A08:00: _OSC: platform does not support [AER]
  acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR]
  pci  0000:00:0e.0: [8086:467f] type 00         # VMD
  vmd  0000:00:0e.0: PCI host bridge to bus 10000:e0
  pci 10000:e0:06.0: [8086:464d] type 01         # Root Port to [bus e1]
  pci 10000:e1:00.0: [144d:a80a] type 00         # Samsung NVMe

So ERR_* messages from the e1:00.0 Samsung device would terminate at
the e0:06.0 Root Port.  That Root Port has an AER Capability with Root
Error Command/Status/Error Source registers.

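For reference, here is a minimal user-space sketch of how a 16-bit
Requester ID, as latched in a Root Port's Error Source Identification
register, breaks down into bus/device/function.  The struct and
function names are hypothetical and only for illustration; the bit
layout (bus in bits 15:8, device in 7:3, function in 2:0) is the
standard PCIe one.  Note that there is no domain field, so an ID such
as e1:00.0 only means something to a Root Port in the same hierarchy.

```c
#include <stdint.h>

/* Hypothetical helper type, not a kernel structure. */
struct requester_id {
	uint8_t bus;	/* bits 15:8 */
	uint8_t dev;	/* bits 7:3  */
	uint8_t fn;	/* bits 2:0  */
};

/* Split a 16-bit Requester ID into bus, device and function numbers. */
static struct requester_id decode_requester_id(uint16_t id)
{
	struct requester_id r;

	r.bus = (uint8_t)(id >> 8);
	r.dev = (uint8_t)((id >> 3) & 0x1f);
	r.fn  = (uint8_t)(id & 0x07);
	return r;
}
```

With this layout, the NVMe device at e1:00.0 reports 0xe100 and the
Root Port at e0:06.0 would be 0xe030; neither value says which domain
(0000 or 10000) it belongs to.
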
> > But if VMD just passes those messages up to the Root Port, the source
> > of the messages (the Requester ID) won't make any sense because
> > they're in a hierarchy the Root Port doesn't know anything about.
>
> Not sure what's current status is but I think Nirmal's patch is valid
> for both our cases.

So I think the question is whether that PNP0A08:00 _OSC applies to
domain 10000.  I think the answer is "no" because the platform doesn't
know about the existence of domain 10000, and it can't access config
space in that domain.

E.g., if _OSC negotiated that the platform owned AER in domain 0000, I
don't think it would make sense for that to mean the platform *also*
owned AER in domain 10000, because the platform doesn't know how to
configure AER or handle AER interrupts in that domain.

Nirmal's patch ignores _OSC for hotplug, but keeps the _OSC results
for AER, PME, and LTR.  I think we should ignore _OSC for *all* of
them.

That would mean reverting 04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC on
PCIe features") completely, so of course we'd have to figure out how
to resolve the AER message flood a different way.

Bjorn

> > > > > >>>> static void vmd_copy_host_bridge_flags(struct pci_host_bridge *root_bridge,
> > > > > >>>>                                        struct pci_host_bridge *vmd_bridge)
> > > > > >>>> {
> > > > > >>>> -       vmd_bridge->native_pcie_hotplug = root_bridge->native_pcie_hotplug;
> > > > > >>>> -       vmd_bridge->native_shpc_hotplug = root_bridge->native_shpc_hotplug;
> > > > > >>>>         vmd_bridge->native_aer = root_bridge->native_aer;
> > > > > >>>>         vmd_bridge->native_pme = root_bridge->native_pme;
> > > > > >>>>         vmd_bridge->native_ltr = root_bridge->native_ltr;
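
To make the difference between the two approaches concrete, here is a
small user-space sketch, not kernel code.  struct bridge_flags and the
helper names are hypothetical stand-ins for the native_* bits of
struct pci_host_bridge.  The all-true default is an assumption that a
newly allocated host bridge starts with every native_* flag set, i.e.
the OS owns each feature unless something clears it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the native_* ownership flags discussed above. */
struct bridge_flags {
	bool native_pcie_hotplug;
	bool native_shpc_hotplug;
	bool native_aer;
	bool native_pme;
	bool native_ltr;
};

/* Assumed default: the OS owns every feature on a fresh host bridge. */
static struct bridge_flags os_default_flags(void)
{
	struct bridge_flags f = { true, true, true, true, true };
	return f;
}

/*
 * Models the quoted patch: hotplug ownership is no longer inherited
 * from the domain 0000 _OSC result, but AER, PME and LTR still are.
 */
static void copy_flags_partial(const struct bridge_flags *root,
			       struct bridge_flags *vmd)
{
	assert(root != NULL && vmd != NULL);
	vmd->native_aer = root->native_aer;
	vmd->native_pme = root->native_pme;
	vmd->native_ltr = root->native_ltr;
}

/*
 * Models ignoring _OSC for all features: nothing is inherited, so the
 * VMD bridge keeps whatever defaults it was created with.
 */
static void copy_flags_none(const struct bridge_flags *root,
			    struct bridge_flags *vmd)
{
	assert(root != NULL && vmd != NULL);
	(void)root;
	(void)vmd;
}
```

With the _OSC result from the bugzilla log above (platform does not
support AER; OS controls PCIeHotplug, SHPCHotplug, PME and LTR), the
partial copy leaves native_aer clear in domain 10000, while ignoring
_OSC entirely leaves it set.  That is why the AER message flood would
need to be handled some other way.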