On Wed, Jun 05, 2019 at 09:05:57PM +0200, Lukas Wunner wrote:
> On Wed, Jun 05, 2019 at 05:58:19PM +0300, Mika Westerberg wrote:
> > PME polling does not take into account that a device that is directly
> > connected to the host bridge may go into D3cold as well. This leads to a
> > situation where the PME poll thread reads from the config space of a
> > device that is in D3cold and gets incorrect information because the
> > config space is not accessible.
> >
> > Here is an example from an Intel Ice Lake system where two PCIe root ports
> > are in D3cold (I've instrumented the kernel to log the PMCSR register
> > contents):
> >
> >   [   62.971442] pcieport 0000:00:07.1: Check PME status, PMCSR=0xffff
> >   [   62.971504] pcieport 0000:00:07.0: Check PME status, PMCSR=0xffff
> >
> > Since 0xffff is interpreted so that PME is pending, the root ports will
> > be runtime resumed. This repeats over and over again, essentially
> > blocking all runtime power management.
> >
> > Prevent this from happening by checking whether the device is in D3cold
> > before its PME status is read.
>
> There's more broken here. The below patch fixes a PME polling race
> and should also fix the issue you're witnessing, could you verify that?

It fixes the issue, but I needed to tune it a bit, see below.

> The patch has been rotting on my development branch for several months,
> I just didn't get around to posting it, my apologies.

Better late than never :)

> -- >8 --
> Subject: [PATCH] PCI / PM: Fix race on PME polling
>
> Since commit df17e62e5bff ("PCI: Add support for polling PME state on
> suspended legacy PCI devices"), the work item pci_pme_list_scan() polls
> the PME status flag of devices and wakes them up if the bit is set.
>
> The function performs a check whether a device's upstream bridge is in
> D0, for otherwise the device is inaccessible, rendering PME polling
> impossible. However, the check is racy because it is performed before
> polling the device. If the upstream bridge runtime suspends to D3hot
> after pci_pme_list_scan() checks its power state and before it invokes
> pci_pme_wakeup(), the latter will read the PMCSR as "all ones" and
> mistake it for a set PME status flag. I am seeing this race play out as
> a Thunderbolt controller going to D3cold and occasionally immediately
> going to D0 again because PME polling was performed at just the wrong
> time.
>
> Avoid this by checking for an "all ones" PMCSR in pci_check_pme_status().
>
> Fixes: 58ff463396ad ("PCI PM: Add function for checking PME status of devices")
> Signed-off-by: Lukas Wunner <lukas@xxxxxxxxx>
> Cc: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx # v2.6.34+
> ---
>  drivers/pci/pci.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b98a564..2e05348 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1753,6 +1753,8 @@ bool pci_check_pme_status(struct pci_dev *dev)
>  	pci_read_config_word(dev, pmcsr_pos, &pmcsr);
>  	if (!(pmcsr & PCI_PM_CTRL_PME_STATUS))
>  		return false;
> +	if (pmcsr == ~0)

Here I needed to change this to

	if (pmcsr == (u16)~0)

I think it is because pmcsr is u16: it gets promoted to int in the
comparison, so without the cast we end up comparing 0xffff ==
0xffffffff (~0 is already int -1), which is never true.

> +		return false;
>
>  	/* Clear PME status. */
>  	pmcsr |= PCI_PM_CTRL_PME_STATUS;
> --
> 2.20.1