On Tue, 23 Jan 2024 20:59:50 +0100
"Rafael J. Wysocki" <rafael@xxxxxxxxxx> wrote:

> On Tue, Jan 23, 2024 at 8:51 PM Alex Williamson
> <alex.williamson@xxxxxxxxxx> wrote:
> >
> > On Tue, 23 Jan 2024 20:40:32 +0100
> > "Rafael J. Wysocki" <rafael@xxxxxxxxxx> wrote:
> >
> > > On Tue, Jan 23, 2024 at 7:56 PM Alex Williamson
> > > <alex.williamson@xxxxxxxxxx> wrote:
> > > >
> > > > The commit noted in fixes added a bogus requirement that runtime PM
> > > > managed devices need to be in the RPM_ACTIVE state for PME polling.
> > > > In fact, only devices in low power states should be polled.
> > > >
> > > > However there's still a requirement that the device config space must
> > > > be accessible, which has implications for both the current state of
> > > > the polled device and the parent bridge, when present.  It's not
> > > > sufficient to assume the bridge remains in D0 and cases have been
> > > > observed where the bridge passes the D0 test, but the PM state
> > > > indicates RPM_SUSPENDING and config space of the polled device becomes
> > > > inaccessible during pci_pme_wakeup().
> > > >
> > > > Therefore, since the bridge is already effectively required to be in
> > > > the RPM_ACTIVE state, formalize this in the code and elevate the PM
> > > > usage count to maintain the state while polling the subordinate
> > > > device.
> > > >
> > > > Cc: Lukas Wunner <lukas@xxxxxxxxx>
> > > > Cc: Mika Westerberg <mika.westerberg@xxxxxxxxxxxxxxx>
> > > > Cc: Rafael J. Wysocki <rafael@xxxxxxxxxx>
> > > > Fixes: d3fcd7360338 ("PCI: Fix runtime PM race with PME polling")
> > > > Reported-by: Sanath S <sanath.s@xxxxxxx>
> > > > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218360
> > > > Signed-off-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
> > > > ---
> > > >  drivers/pci/pci.c | 37 ++++++++++++++++++++++---------------
> > > >  1 file changed, 22 insertions(+), 15 deletions(-)
> > > >
> > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > > index bdbf8a94b4d0..764d7c977ef4 100644
> > > > --- a/drivers/pci/pci.c
> > > > +++ b/drivers/pci/pci.c
> > > > @@ -2433,29 +2433,36 @@ static void pci_pme_list_scan(struct work_struct *work)
> > > >  		if (pdev->pme_poll) {
> > > >  			struct pci_dev *bridge = pdev->bus->self;
> > > >  			struct device *dev = &pdev->dev;
> > > > -			int pm_status;
> > > > +			struct device *bdev = bridge ? &bridge->dev : NULL;
> > > > +			int bref = 0;
> > > >
> > > >  			/*
> > > > -			 * If bridge is in low power state, the
> > > > -			 * configuration space of subordinate devices
> > > > -			 * may be not accessible
> > > > +			 * If we have a bridge, it should be in an active/D0
> > > > +			 * state or the configuration space of subordinate
> > > > +			 * devices may not be accessible or stable over the
> > > > +			 * course of the call.
> > > >  			 */
> > > > -			if (bridge && bridge->current_state != PCI_D0)
> > > > -				continue;
> > > > +			if (bdev) {
> > > > +				bref = pm_runtime_get_if_active(bdev, true);
> > > > +				if (!bref)
> > >
> > > I would check bref <= 0 here.
> > >
> > > > +					continue;
> > > > +
> > > > +				if (bridge->current_state != PCI_D0)
> > >
> > > Isn't the power state guaranteed to be PCI_D0 at this point?  If it
> > > isn't, then why?
> >
> > Both of these seem necessary to support !CONFIG_PM, where bref would be
> > -EINVAL and provides no indication of the current_state.  Is that
> > incorrect?  Thanks,
>
> Well, CONFIG_PCIE_PME depends on CONFIG_PM, so I'm not sure how
> dev->pme_poll can be set without it.

I only see that drivers/pci/pci.c:pci_pm_init() sets pme_poll true and
I'm not spotting a dependency on either PCIE_PME or PM to get there.  I
see a few places where pme.c, governed by PCIE_PME, can set pme_poll
false though.

It's also not clear to me that we should skip scanning a device if
pm_runtime_get_if_active() returns -EINVAL for the bridge due to
power.disable_depth.  If runtime PM isn't enabled on the bridge,
shouldn't we be able to test current_state and assume it won't change?

Thanks,
Alex
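
For illustration only, the bridge check argued for above can be modeled
in user space roughly as follows.  This is a sketch, not the patch
itself: the model_* names and the enum are hypothetical stand-ins, and
bref only mimics the return convention of pm_runtime_get_if_active() as
described in this thread (positive when the bridge is active and a
reference was taken, 0 when runtime PM is enabled but the bridge is not
active, -EINVAL when runtime PM is disabled or !CONFIG_PM).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's PCI power states. */
enum model_pci_power { MODEL_PCI_D0 = 0, MODEL_PCI_D3HOT = 3 };

/*
 * Decide whether a device behind a bridge may be polled.
 *
 * bref == 0: runtime PM is enabled on the bridge but it is not
 * active, so config space of the subordinate device may go away;
 * skip the device.
 *
 * bref > 0 or bref == -EINVAL: either a reference pins the bridge
 * active, or runtime PM is disabled and the state is assumed not to
 * change.  In both cases the bridge's current_state is still tested.
 */
static bool model_bridge_allows_poll(int bref,
				     enum model_pci_power bridge_state)
{
	if (bref == 0)
		return false;

	return bridge_state == MODEL_PCI_D0;
}

/* Only a successfully taken reference needs to be dropped afterwards. */
static bool model_must_put_bridge(int bref)
{
	return bref > 0;
}
```

With this model, a bridge with runtime PM disabled (bref == -EINVAL) is
still polled behind as long as it reports D0, which is the behavior that
a plain "bref <= 0" test would rule out.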