On Monday, June 17, 2019 4:53:48 PM CEST Lukas Wunner wrote:
> On Mon, Jun 17, 2019 at 05:35:10PM +0300, Mika Westerberg wrote:
> > Today when doing some PM testing I noticed that this patch actually
> > reveals an issue in our native PME handling. Problem is in
> > pcie_pme_handle_request() where we first convert req_id to struct
> > pci_dev and then call pci_check_pme_status() for it. Now, when a device
> > triggers wake the link is first brought up and then the PME is sent to
> > root complex with req_id matching the originating device. However, if
> > there are PCIe ports in the middle they may still be in D3 which means
> > that pci_check_pme_status() returns 0xffff for the device below so there
> > are lots of
> >
> >   "Spurious native interrupt"
> >
> > messages in the dmesg but the actual PME is never handled.
> >
> > It has been working because pci_check_pme_status() returned true in case
> > of 0xffff as well and we went and runtime resumed to originating device.
> >
> > I think the correct way to handle this is actually drop the call to
> > pci_check_pme_status() in pcie_pme_handle_request() because the whole
> > idea of req_id in PME message is to allow the root complex and SW to
> > identify the device without need to poll for the PME status bit.
>
> Either that or the call to pci_check_pme_status() should be encapsulated
> in a pci_config_pm_runtime_get() / _put() pair.

And the whole hierarchy might as well be resumed, which could be rather
wasteful.

The problem is that the $subject patch should affect polling only, but it has
side effects beyond that.
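
To make the failure mode above concrete, here is a minimal user-space model
of it. This is only a sketch, not kernel code: struct model_dev and every
model_*/check_pme_* name below are hypothetical, and the only value borrowed
from the kernel is the PME status bit in PMCSR (0x8000). It assumes that a
config read of a device below a port that is not in D0 returns 0xffff, as
described in the quoted report, and that the patch under discussion makes
the status check reject that value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PME_STATUS 0x8000u /* PME status bit in PMCSR */

/* Hypothetical stand-in for a PCI device in this model. */
struct model_dev {
	struct model_dev *parent; /* upstream port, NULL at the root */
	bool in_d0;               /* false: device is in D3 */
	uint16_t pmcsr;           /* value a successful read would return */
};

/* Config reads fail (all ones) if any upstream port is not in D0. */
static uint16_t model_read_pmcsr(const struct model_dev *dev)
{
	for (const struct model_dev *p = dev->parent; p; p = p->parent)
		if (!p->in_d0)
			return 0xffff;
	return dev->pmcsr;
}

/* Old behaviour: 0xffff has the status bit set, so it counts as a PME. */
static bool check_pme_old(const struct model_dev *dev)
{
	return (model_read_pmcsr(dev) & PME_STATUS) != 0;
}

/* Assumed new behaviour: an all-ones read is treated as an error. */
static bool check_pme_new(const struct model_dev *dev)
{
	uint16_t pmcsr = model_read_pmcsr(dev);

	if (pmcsr == 0xffff)
		return false;
	return (pmcsr & PME_STATUS) != 0;
}

/*
 * Stand-in for resuming the hierarchy before the config access.
 * Returns how many upstream ports had to be woken up.
 */
static int model_resume_upstream(struct model_dev *dev)
{
	int woken = 0;

	for (struct model_dev *p = dev->parent; p; p = p->parent) {
		if (!p->in_d0) {
			p->in_d0 = true;
			woken++;
		}
	}
	return woken;
}
```

With an endpoint that has PME status set behind a port in D3, check_pme_old()
reports a PME only because the failed read happens to have the bit set,
check_pme_new() reports nothing, and check_pme_new() after
model_resume_upstream() reports the real status at the cost of waking every
port on the path.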