On Fri, 28 Feb 2025, Lukas Wunner wrote: > On Thu, Feb 27, 2025 at 06:31:08AM +0100, Lukas Wunner wrote: > > pcie_bwnotif_irq() accesses the Link Status register without > > acquiring a runtime PM reference on the PCIe port. This feels > > wrong and may also contribute to the issue reported here. > > Acquiring a runtime PM ref may sleep, so I think you need to > > change the driver to use a threaded IRQ handler. > > I've realized we've had a discussion before why a threaded IRQ handler > doesn't make sense... Yes. > https://lore.kernel.org/all/Z35qJ3H_8u5LQDJ6@xxxxxxxxx/ > > ...but I'm still worried that a Downstream Port in a nested-switch > configuration may be runtime suspended while the hardirq handler > is running. Is there anything preventing that from happening? I don't think there is. > To access config space of a port, it's sufficient if its upstream > bridge is runtime active (i.e. in PCI D0). > > So basically the below is what I have in mind. This assumes that > the upstream bridge is still in D0 when the interrupt handler runs > because in atomic context we can't wait for it to be runtime resumed. > Seems like a fair assumption to me but what do I know... bwctrl doesn't even want to resume the port in the irqhandler. If the port is suspended, why would it have LBMS/LABS, and we disabled notifications anyway in suspend handler anyway so we're not even expecting them to come during a period of suspend (which does not mean there couldn't be interrupts due to other sources). So there should be no problem in not calling resume for it. > -- >8 -- > > diff --git a/drivers/pci/pcie/bwctrl.c b/drivers/pci/pcie/bwctrl.c > index 0a5e7efbce2c..fea8f7412266 100644 > --- a/drivers/pci/pcie/bwctrl.c > +++ b/drivers/pci/pcie/bwctrl.c > @@ -28,6 +28,7 @@ > #include <linux/mutex.h> > #include <linux/pci.h> > #include <linux/pci-bwctrl.h> > +#include <linux/pm_runtime.h> > #include <linux/rwsem.h> > #include <linux/slab.h> > #include <linux/types.h> > @@ -235,9 +236,13 @@ static irqreturn_t pcie_bwnotif_irq(int irq, void *context) > struct pcie_device *srv = context; > struct pcie_bwctrl_data *data = srv->port->link_bwctrl; > struct pci_dev *port = srv->port; > + struct device *parent __free(pm_runtime_put) = port->dev.parent; > u16 link_status, events; > int ret; > > + if (parent) > + pm_runtime_get_noresume(parent); > + Should this then check if its suspended and return early if it is suspended? pm_runtime_suspended() has some caveats in the kerneldoc though so I'm a bit unsure if it can be called safely here, probably not. > ret = pcie_capability_read_word(port, PCI_EXP_LNKSTA, &link_status); > if (ret != PCIBIOS_SUCCESSFUL) > return IRQ_NONE; > diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h > index d39dc863f612..038228de773d 100644 > --- a/include/linux/pm_runtime.h > +++ b/include/linux/pm_runtime.h > @@ -448,6 +448,8 @@ static inline int pm_runtime_put(struct device *dev) > return __pm_runtime_idle(dev, RPM_GET_PUT | RPM_ASYNC); > } > > +DEFINE_FREE(pm_runtime_put, struct device *, if (_T) pm_runtime_put(_T)) > + > /** > * __pm_runtime_put_autosuspend - Drop device usage counter and queue autosuspend if 0. > * @dev: Target device. > -- i.