On Wed, Jun 05, 2019 at 05:58:19PM +0300, Mika Westerberg wrote: > PME polling does not take into account that a device that is directly > connected to the host bridge may go into D3cold as well. This leads to a > situation where the PME poll thread reads from a config space of a > device that is in D3cold and gets incorrect information because the > config space is not accessible. > > Here is an example from Intel Ice Lake system where two PCIe root ports > are in D3cold (I've instrumented the kernel to log the PMCSR register > contents): > > [ 62.971442] pcieport 0000:00:07.1: Check PME status, PMCSR=0xffff > [ 62.971504] pcieport 0000:00:07.0: Check PME status, PMCSR=0xffff > > Since 0xffff is interpreted so that PME is pending, the root ports will > be runtime resumed. This repeats over and over again essentially > blocking all runtime power management. > > Prevent this from happening by checking whether the device is in D3cold > before its PME status is read. There's more broken here. The below patch fixes a PME polling race and should also fix the issue you're witnessing, could you verify that? The patch has been rotting on my development branch for several months, I just didn't get around to posting it, my apologies. -- >8 -- Subject: [PATCH] PCI / PM: Fix race on PME polling Since commit df17e62e5bff ("PCI: Add support for polling PME state on suspended legacy PCI devices"), the work item pci_pme_list_scan() polls the PME status flag of devices and wakes them up if the bit is set. The function performs a check whether a device's upstream bridge is in D0 for otherwise the device is inaccessible, rendering PME polling impossible. However the check is racy because it is performed before polling the device. If the upstream bridge runtime suspends to D3hot after pci_pme_list_scan() checks its power state and before it invokes pci_pme_wakeup(), the latter will read the PMCSR as "all ones" and mistake it for a set PME status flag. I am seeing this race play out as a Thunderbolt controller going to D3cold and occasionally immediately going to D0 again because PM polling was performed at just the wrong time. Avoid by checking for an "all ones" PMCSR in pci_check_pme_status(). Fixes: 58ff463396ad ("PCI PM: Add function for checking PME status of devices") Signed-off-by: Lukas Wunner <lukas@xxxxxxxxx> Cc: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx> Cc: stable@xxxxxxxxxxxxxxx # v2.6.34+ --- drivers/pci/pci.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index b98a564..2e05348 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -1753,6 +1753,8 @@ bool pci_check_pme_status(struct pci_dev *dev) pci_read_config_word(dev, pmcsr_pos, &pmcsr); if (!(pmcsr & PCI_PM_CTRL_PME_STATUS)) return false; + if (pmcsr == ~0) + return false; /* Clear PME status. */ pmcsr |= PCI_PM_CTRL_PME_STATUS; -- 2.20.1