On Mon, Feb 17, 2025 at 06:52:58PM +0200, Ilpo Järvinen wrote:
> PCIe BW controller enables BW notifications for Downstream Ports by
> setting Link Bandwidth Management Interrupt Enable (LBMIE) and Link
> Autonomous Bandwidth Interrupt Enable (LABIE) (PCIe Spec. r6.2 sec.
> 7.5.3.7).
>
> It was discovered that performing a reset can lead to the device
> underneath the Downstream Port becoming unavailable if BW notifications
> are left enabled throughout the reset sequence (at least LBMIE was
> found to cause an issue).

What kind of reset? FLR? SBR? This needs to be specified in the
commit message so that the reader isn't forced to sift through a
bugzilla with dozens of comments and attachments.

The commit message should also mention the type of affected device
(Nvidia GPU AD107M [GeForce RTX 4050 Max-Q / Mobile]). The Root Port
above is an AMD one; that may be relevant as well.

> While the PCIe Specifications do not indicate BW notifications could not
> be kept enabled during resets, the PCIe Link state during an
> intentional reset is not of large interest. Thus, disable BW controller
> for the bridge while reset is performed and re-enable it after the
> reset has completed to workaround the problems some devices encounter
> if BW notifications are left on throughout the reset sequence.

This approach won't work if the reset is performed without software
intervention. E.g. if a DPC event occurs, the device likewise
undergoes a reset, but there is no prior system software involvement.
Software only becomes involved *after* the reset has occurred.

I think it needs to be tested whether the same issue occurs with DPC.
It's easy to simulate DPC by setting the Software Trigger bit:

  setpci -s 00:01.1 ECAP_DPC+6.w=40:40

If the issue does occur with DPC, then this fix isn't sufficient.

> Keep a counter for the disable/enable because MFD will execute
> pci_dev_save_and_disable() and pci_dev_restore() back to back for
> sibling devices:
>
> [ 50.139010] vfio-pci 0000:01:00.0: resetting
> [ 50.139053] vfio-pci 0000:01:00.1: resetting
> [ 50.141126] pcieport 0000:00:01.1: PME: Spurious native interrupt!
> [ 50.141133] pcieport 0000:00:01.1: PME: Spurious native interrupt!
> [ 50.441466] vfio-pci 0000:01:00.0: reset done
> [ 50.501534] vfio-pci 0000:01:00.1: reset done

So why are you citing the PME messages here? Are they relevant? Do
they not occur when the bandwidth controller is disabled? If they do
not, they may provide a clue as to what's going on. But that's not
clear from the commit message.

> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5166,6 +5167,9 @@ static void pci_dev_save_and_disable(struct pci_dev *dev)
>           */
>          pci_set_power_state(dev, PCI_D0);
>
> +        if (bridge)
> +                pcie_bwnotif_disable(bridge);
> +
>          pci_save_state(dev);

Instead of putting this in the PCI core, amend pcie_portdrv_err_handler
with ->reset_prepare and ->reset_done callbacks that call down to all
the port service drivers, then amend bwctrl.c to disable/enable
interrupts in these callbacks. (Rough sketch appended below.)

> +        port->link_bwctrl->disable_count--;
> +        if (!port->link_bwctrl->disable_count) {
> +                __pcie_bwnotif_enable(port);
> +                pci_dbg(port, "BW notifications enabled\n");
> +        }
> +        WARN_ON_ONCE(port->link_bwctrl->disable_count < 0);

So why do you need to count this? IIUC you get two consecutive disable
and two consecutive enable events. If the interrupts are already
disabled, just do nothing. Same for enablement. Any reason this
simpler approach (second sketch below) doesn't work?

Thanks,

Lukas
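
P.S.: To illustrate the portdrv direction, here's a rough sketch of
what I have in mind. It is untested (not even compile-tested), and the
pcie_bwnotif_disable()/_enable() calls are just placeholders for
whatever helpers bwctrl.c ends up providing; the real version would
dispatch to the bound port service drivers the same way the existing
->slot_reset handling does:

/* drivers/pci/pcie/portdrv.c -- rough sketch, not compile-tested */

static void pcie_portdrv_reset_prepare(struct pci_dev *dev)
{
        /*
         * Quiesce port services before the reset is carried out.
         * For bwctrl this means clearing LBMIE/LABIE in the port's
         * Link Control register.  Placeholder call; the real version
         * would iterate over the port services bound to this port.
         */
        pcie_bwnotif_disable(dev);
}

static void pcie_portdrv_reset_done(struct pci_dev *dev)
{
        /* Re-arm BW notifications once the reset has completed. */
        pcie_bwnotif_enable(dev);
}

static const struct pci_error_handlers pcie_portdrv_err_handler = {
        .error_detected = pcie_portdrv_error_detected,
        .slot_reset     = pcie_portdrv_slot_reset,
        .mmio_enabled   = pcie_portdrv_mmio_enabled,
        .resume         = pcie_portdrv_err_resume,
        .reset_prepare  = pcie_portdrv_reset_prepare,
        .reset_done     = pcie_portdrv_reset_done,
};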
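
P.P.S.: Regarding the counter, what I was thinking of for bwctrl.c is
simply bailing out if the bits are already in the requested state,
along these lines (again only a sketch, function name taken from your
patch):

/* drivers/pci/pcie/bwctrl.c -- sketch only */

void pcie_bwnotif_disable(struct pci_dev *port)
{
        u16 lnkctl;

        pcie_capability_read_word(port, PCI_EXP_LNKCTL, &lnkctl);

        /* Already disabled, e.g. by the sibling function's reset?  Done. */
        if (!(lnkctl & (PCI_EXP_LNKCTL_LBMIE | PCI_EXP_LNKCTL_LABIE)))
                return;

        pcie_capability_clear_word(port, PCI_EXP_LNKCTL,
                                   PCI_EXP_LNKCTL_LBMIE | PCI_EXP_LNKCTL_LABIE);
}

If there's a race with the back-to-back sibling resets that makes this
insufficient, that would be worth spelling out in the commit message.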