Hi Chad,

On Tue, Sep 19, 2023 at 03:08:08PM +0000, Schroeder, Chad wrote:
> 0000:64:00.0 PCI bridge: Intel Corporation Sky Lake-E PCI Express Root Port A (rev 07) (prog-if 00 [Normal decode])
[...]
> 	Bus: primary=64, secondary=65, subordinate=65, sec-latency=0
[...]
> 	Capabilities: [90] Express (v2) Root Port (Slot+), MSI 00
[...]
> 		LnkCap:	Port #5, Speed 8GT/s, Width x16, ASPM L1, Exit Latency L1 <16us
> 			ClockPM- Surprise+ LLActRep+ BwNot+ ASPMOptComp+
> 		LnkCtl:	ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
> 			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> 		LnkSta:	Speed 2.5GT/s, Width x1
> 			TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-

pci_bridge_secondary_bus_reset() calls pcibios_reset_secondary_bus()
to perform the SBR, then calls pci_bridge_wait_for_secondary_bus()
to perform the delays prescribed by the spec.

At the bottom of pci_bridge_wait_for_secondary_bus(), there's an
if-else clause. Based on the lspci output quoted above, I'd expect
control flow to enter the else branch (because the Sky Lake Root Port
supports more than 5 GT/s, namely 8 GT/s).

The function then calls pcie_wait_for_link_delay() to wait for the
link to come up, then waits 100 msec per PCIe r6.1 sec 6.6.1.
Afterwards it calls pci_dev_wait() to poll the Vendor ID register of
the VideoPropulsion card.

My guess is that the VideoPropulsion card requires a longer delay
than 100 msec before its config space is first accessed. The PCI
system errors that you mention are probably raised by the card
because it is not yet ready to handle those config space accesses.

Since this is a PCIe r1.0 card, I've checked whether PCIe r1.0
prescribes longer delays after reset than contemporary revisions of
the PCIe Base Spec. That's not the case. PCIe r1.0 sec 7.6 says:

   "To allow components to perform internal initialization, system
    software must wait for at least 100 ms from the end of a reset
    (cold/warm/hot) before it is permitted to issue Configuration
    Requests to PCI Express devices [...]
    The Root Complex and/or system software must allow 1.0s
    (+50% / -0%) after a reset (hot/warm/cold), before it may
    determine that a device which fails to return a Successful
    Completion status for a valid Configuration Request is a broken
    device"

Those timing requirements are essentially identical to what
contemporary PCIe revisions prescribe, and they are what the code in
the kernel follows. This leads me to believe that the longer delay
this particular card requires before the first config space access
is a quirk. So I'm proposing the following patch:

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 20ac67d..3cbff71 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5993,3 +5993,9 @@ static void dpc_log_size(struct pci_dev *dev)
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a2f, dpc_log_size);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a31, dpc_log_size);
 #endif
+
+static void pci_fixup_d3cold_delay_1sec(struct pci_dev *pdev)
+{
+	pdev->d3cold_delay = 1000;
+}
+DECLARE_PCI_FIXUP_FINAL(0x5555, 0x0004, pci_fixup_d3cold_delay_1sec);

Could you test whether applying this on top of v6.1.16 fixes the
issue? (Apply with "patch -p1 < filename.patch" at the top of the
kernel source tree.)

Thanks,

Lukas
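
P.S.: In case it helps to see why d3cold_delay is the knob I'm
reaching for: as far as I can tell, pci_bridge_wait_for_secondary_bus()
derives the delay it uses (and passes to pcie_wait_for_link_delay())
from the maximum d3cold_delay of the devices on the secondary bus.
Below is a simplified sketch of that helper as I remember it from
drivers/pci/pci.c, not a verbatim quote, so take the details with a
grain of salt:

/*
 * Simplified sketch (from memory, not verbatim) of the helper in
 * drivers/pci/pci.c that pci_bridge_wait_for_secondary_bus() uses
 * to pick the post-reset delay.  With the quirk above, the
 * VideoPropulsion card's d3cold_delay of 1000 msec becomes the
 * maximum on bus 65, so the Root Port would wait 1 sec instead of
 * 100 msec before the first config space access.
 */
static int pci_bus_max_d3cold_delay(const struct pci_bus *bus)
{
	const struct pci_dev *pdev;
	int min_delay = 100;
	int max_delay = 0;

	list_for_each_entry(pdev, &bus->devices, bus_list) {
		if (pdev->d3cold_delay < min_delay)
			min_delay = pdev->d3cold_delay;
		if (pdev->d3cold_delay > max_delay)
			max_delay = pdev->d3cold_delay;
	}

	return max(min_delay, max_delay);
}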