Recently, I discovered an issue with a PCIe device and recent kernels. Background: Using a Linux host system, the PCIe device, a Torrent QN16e PCIe 16-128 Channel QAM Modulator, is passed through to a FreeBSD guest. Until recently, this was working as expected. Issue: Starting with kernel v6.1.16, when the guest domain was started, the kernel would immediately report errors via vfio-pci. kernel: vfio-pci 0000:65:00.0: vfio_bar_restore: reset recovery - restoring BARs The guest would boot and its driver would load. Then, when guest user-space, would access the device, a PCI system error (PERR) was raised, as reported by the impi event log, and the hardware itself would suffer a catastrophic event, cycling the server. Discovery: After researching the issue, I found the commit that lead system error: https://lore.kernel.org/all/da77c92796b99ec568bd070cbe4725074a117038.1673769517.git.lukas@xxxxxxxxx/ Specifically, this removal: - Drop an unnecessary 1 sec delay from pci_reset_secondary_bus() which is now performed by pci_bridge_wait_for_secondary_bus(). A static delay this long is only necessary for Conventional PCI, so modern PCIe systems benefit from shorter reset times as a side effect. Resolution: I reintroduced the 1 second delay to pci_reset_secondary_bus, recompiled and installed and the system is now working as expected. void pci_reset_secondary_bus(struct pci_dev *dev) { u16 ctrl; pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &ctrl); ctrl |= PCI_BRIDGE_CTL_BUS_RESET; pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl); /* * PCI spec v3.0 7.6.4.2 requires minimum Trst of 1ms. Double * this to 2ms to ensure that we meet the minimum requirement. */ msleep(2); ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET; pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl); ssleep(1); } PCIe Device: https://www.videopropulsion.com/content/torrent-qn16e-pcie-16-128-channel-qam-modulator $ lspci 0000:65:00.0 Multimedia controller: Genroco, Inc Device 0004 (rev 01) Subsystem: Genroco, Inc Device 0004 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 32 bytes Interrupt: pin A routed to IRQ 349 NUMA node: 0 IOMMU group: 1 Region 0: Memory at e0e00000 (32-bit, non-prefetchable) [size=32K] Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+ Address: 0000000000000000 Data: 0000 Capabilities: [78] Power Management version 3 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-) Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- Capabilities: [80] Express (v2) Endpoint, MSI 00 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 75W DevCtl: CorrErr- NonFatalErr- FatalErr+ UnsupReq- RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ MaxPayload 256 bytes, MaxReadReq 512 bytes DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend- LnkCap: Port #1, Speed 2.5GT/s, Width x1, ASPM L0s, Exit Latency L0s <4us ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+ LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+ ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- LnkSta: Speed 2.5GT/s, Width x1 TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR- 10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix- EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit- FRS- TPHComp- ExtTPHComp- AtomicOpsCap: 32bit- 64bit- 128bitCAS- DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 10BitTagReq- OBFF Disabled, AtomicOpsCtl: ReqEn- LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis- Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete- EqualizationPhase1- EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest- Retimer- 2Retimers- CrosslinkRes: unsupported Capabilities: [100 v1] Virtual Channel Caps: LPEVC=0 RefClk=100ns PATEntryBits=1 Arb: Fixed- WRR32- WRR64- WRR128- Ctrl: ArbSelect=Fixed Status: InProgress- VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans- Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256- Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01 Status: NegoPending- InProgress- Capabilities: [200 v1] Vendor Specific Information: ID=1172 Rev=0 Len=044 <?> Kernel driver in use: vfio-pci Regards, Chad Schroeder