Re: [PATCH v4] PCI: vmd: Do not change the BIOS Hotplug setting on VMD rootports



On Tue, Sep 19, 2023 at 11:31:57AM +0800, Kai-Heng Feng wrote:
> On Wed, Sep 13, 2023 at 8:50 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> [snipped]
> > Hmm.  In some ways the VMD device acts as a Root Port, since it
> > originates a new hierarchy in a separate domain, but on the upstream
> > side, it's just a normal endpoint.
> >
> > How does AER for the new hierarchy work?  A device below the VMD can
> > generate ERR_COR/ERR_NONFATAL/ERR_FATAL messages.  I guess I was
> > assuming those messages would terminate at the VMD, and the VMD could
> > generate an AER interrupt just like a Root Port.  But that can't be
> > right because I don't think VMD would have the Root Error Command
> > register needed to manage that interrupt.
> VMD itself doesn't seem to manage AER, the rootport that "moved" from
> 0000 domain does:
> [ 2113.507345] pcieport 10000:e0:06.0: AER: Corrected error received: 10000:e1:00.0
> [ 2113.507380] nvme 10000:e1:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
> [ 2113.507389] nvme 10000:e1:00.0:   device [144d:a80a] error status/mask=00000001/0000e000
> [ 2113.507398] nvme 10000:e1:00.0:    [ 0] RxErr                  (First)

Oh, I forgot how VMD works.  It sounds like there *is* a Root Port
that is logically below the VMD, e.g., (from

  ACPI: PCI Root Bridge [PC00] (domain 0000 [bus 00-e0])
  acpi PNP0A08:00: _OSC: platform does not support [AER]
  acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR]
  pci  0000:00:0e.0: [8086:467f] type 00         # VMD
  vmd  0000:00:0e.0: PCI host bridge to bus 10000:e0
  pci 10000:e0:06.0: [8086:464d] type 01         # Root Port to [bus e1]
  pci 10000:e1:00.0: [144d:a80a] type 00         # Samsung NVMe

So ERR_* messages from the e1:00.0 Samsung device would terminate at
the e0:06.0 Root Port.  That Root Port has an AER Capability with Root
Error Command/Status/Error Source registers.
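
To make the Error Source register part concrete, here is a rough standalone
sketch (not kernel code) of how a 16-bit source ID from the Error Source
Identification register maps to a bus:dev.fn string.  The helper names
(rid_decode, err_src_cor, err_src_uncor, rid_format) are made up for
illustration, and a register value such as 0x0000e100 is only my guess at what
the e0:06.0 Root Port would have latched for the error in the log above.  Note
that the source ID carries no domain; the domain has to come from the Root Port
that received the message.

```c
#include <stdint.h>
#include <stdio.h>

/* A PCIe Requester ID packs bus[15:8], device[7:3], function[2:0]. */
struct pcie_rid {
	uint8_t bus;
	uint8_t dev;
	uint8_t fn;
};

/* Hypothetical helper: split a 16-bit Requester ID into its fields. */
static struct pcie_rid rid_decode(uint16_t rid)
{
	struct pcie_rid r = {
		.bus = (uint8_t)(rid >> 8),
		.dev = (uint8_t)((rid >> 3) & 0x1f),
		.fn  = (uint8_t)(rid & 0x07),
	};
	return r;
}

/*
 * Error Source Identification register layout:
 *   [15:0]  source of the first ERR_COR
 *   [31:16] source of the first ERR_FATAL/ERR_NONFATAL
 */
static uint16_t err_src_cor(uint32_t reg)
{
	return (uint16_t)(reg & 0xffff);
}

static uint16_t err_src_uncor(uint32_t reg)
{
	return (uint16_t)(reg >> 16);
}

/*
 * Format "domain:bus:dev.fn".  The domain is supplied by the caller
 * because the Requester ID itself does not contain one.
 */
static void rid_format(char *buf, size_t len, unsigned int domain, uint16_t rid)
{
	struct pcie_rid r = rid_decode(rid);

	snprintf(buf, len, "%04x:%02x:%02x.%x", domain, r.bus, r.dev, r.fn);
}
```

The same 16-bit ID formatted with domain 0000 instead of 10000 would name a
device that does not exist in domain 0000, which is the concern raised in the
quoted text just below.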

> > But if VMD just passes those messages up to the Root Port, the source
> > of the messages (the Requester ID) won't make any sense because
> > they're in a hierarchy the Root Port doesn't know anything about.
> Not sure what the current status is, but I think Nirmal's patch is valid
> for both our cases.

So I think the question is whether that PNP0A08:00 _OSC applies to
domain 10000.  I think the answer is "no" because the platform doesn't
know about the existence of domain 10000, and it can't access config
space in that domain.

E.g., if _OSC negotiated that the platform owned AER in domain 0000, I
don't think it would make sense for that to mean the platform *also*
owned AER in domain 10000, because the platform doesn't know how to
configure AER or handle AER interrupts in that domain.

Nirmal's patch ignores _OSC for hotplug, but keeps the _OSC results
for AER, PME, and LTR.  I think we should ignore _OSC for *all* of

That would mean reverting 04b12ef163d1 ("PCI: vmd: Honor ACPI _OSC on
PCIe features") completely, so of course we'd have to figure out how
to resolve the AER message flood a different way.
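
To spell out the difference between the two approaches, here is a small
standalone model, not the actual driver code.  struct bridge_flags is a
hypothetical stand-in for the native_* bits in struct pci_host_bridge, and the
function names are invented.  It assumes a new host bridge starts out with
every native_* bit set, i.e., the OS owns each feature until something clears
it.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the native_* bits of struct pci_host_bridge. */
struct bridge_flags {
	bool native_pcie_hotplug;
	bool native_shpc_hotplug;
	bool native_aer;
	bool native_pme;
	bool native_ltr;
};

/* Assumed starting point: the OS owns every feature on a new bridge. */
static struct bridge_flags bridge_flags_default(void)
{
	struct bridge_flags f = { true, true, true, true, true };

	return f;
}

/*
 * Behaviour of the quoted v4 hunk: the hotplug bits are left alone,
 * but the AER/PME/LTR results from domain 0000's _OSC are still copied.
 */
static void copy_flags_v4(const struct bridge_flags *root,
			  struct bridge_flags *vmd)
{
	vmd->native_aer = root->native_aer;
	vmd->native_pme = root->native_pme;
	vmd->native_ltr = root->native_ltr;
}

/*
 * Full revert: nothing negotiated for domain 0000 is applied to the
 * VMD bridge, so it keeps whatever it started with.
 */
static void copy_flags_none(const struct bridge_flags *root,
			    struct bridge_flags *vmd)
{
	(void)root;
	(void)vmd;
}
```

With a root bridge like the one in the log above (no AER control granted by
_OSC), copy_flags_v4() leaves the VMD bridge with native_aer cleared, while
copy_flags_none() leaves native AER enabled in domain 10000, which is where the
AER message flood would need a different fix.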


> > > > > >>>>  static void vmd_copy_host_bridge_flags(struct pci_host_bridge *root_bridge,
> > > > > >>>>                                         struct pci_host_bridge *vmd_bridge)
> > > > > >>>>  {
> > > > > >>>> -        vmd_bridge->native_pcie_hotplug = root_bridge->native_pcie_hotplug;
> > > > > >>>> -        vmd_bridge->native_shpc_hotplug = root_bridge->native_shpc_hotplug;
> > > > > >>>>          vmd_bridge->native_aer = root_bridge->native_aer;
> > > > > >>>>          vmd_bridge->native_pme = root_bridge->native_pme;
> > > > > >>>>          vmd_bridge->native_ltr = root_bridge->native_ltr;
