[+cc Mika, Rafael, linux-pci]

Hi Matthias,

Thanks a lot for this report!

Mika, this bisected to upstream c2bf1fc212f7 ("PCI: Add missing link delays required by the PCIe spec").

Matthias, would you mind opening a separate report for the spurious PME issue you mentioned with 5.2.5? Seems like we should try to figure that one out, too.

----- Forwarded message from bugzilla-daemon@xxxxxxxxxxxxxxxxxxx -----

Date: Fri, 02 Aug 2019 16:26:45 +0000
From: bugzilla-daemon@xxxxxxxxxxxxxxxxxxx
To: bugzilla.pci@xxxxxxxxx
Subject: [Bug 204413] New: "PCI: Add missing link delays" causes regression on resume from suspend to ram
Message-ID: <bug-204413-193951@xxxxxxxxxxxxxxxxxxxxxxxxx/>

https://bugzilla.kernel.org/show_bug.cgi?id=204413

Bug ID: 204413
Summary: "PCI: Add missing link delays" causes regression on resume from suspend to ram
Product: Drivers
Version: 2.5
Kernel Version: 5.1.20, 5.2.5, 5.3-rc1?
Hardware: All
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: PCI
Assignee: drivers_pci@xxxxxxxxxxxxxxxxxxxx
Reporter: matthias.andree@xxxxxx
Regression: Yes

Description of problem:
vanilla 5.1.20 on x86_64 fails to wake from suspend (STR), Fedora and vanilla 5.1.19 and prior were fine. 5.2.5 (from Fedora's-200.fc30) also fails in a different way (spurious PME interrupts on pcie).

How reproducible: always

Steps to Reproduce:
1. boot Fedora 30 and log into GNOME desktop
2. click pause symbol to suspend the computer to RAM, wait until suspended
3. press key on keyboard, or power button

Actual results:
computer tries to wake up, HDD LED blinks a bit, but console does not wake. Other computer on network cannot ping the waking computer. "sync" hangs in D deep sleep for long amounts of time.

Expected results:
computer wakes up properly and continues to use its devices.

Additional info:
PM tracing was enabled, the next boot returned
[ 0.827930] PM: hash matches drivers/base/power/main.c:1021
It appears that suspend to disk still works.
Computer has an NVIDIA GeForce 1060 PCIe graphics board, but 5.1.19 and prior would suspend properly, and the 5.1.20 and 5.2.5 suspend issues also occur if nvidia kernel modules are renamed out of the way and nouveau remains blocked, so it's not an nvidia driver issue.

I have "git bisect"ed this on the vanilla stable kernel, the stable/linux-5.1.y branch (because I have had starting points 5.1.19 and 5.1.20 there).

The failure-inducing commit on the branch is 3c795a8e3481e4dec071b5956e7177e816f6e7f1 (see below), which got picked from master's c2bf1fc212f7e6f25ace1af8f0b3ac061ea48ba5, (merged through cf2d213e49fdf47e4c10dc629a3659e0026a54b8, v5.3-rc1~167) and also got picked to stable/linux-5.2.y 5817d78eba34f6c86f5462ae2c5212f80a013357 (v5.2.3~291). Sasha Levin's signoff is only on the stable branches, not on master.

------------------------------------------------------------
commit 3c795a8e3481e4dec071b5956e7177e816f6e7f1 (refs/bisect/bad)
Author: Mika Westerberg <mika.westerberg@xxxxxxxxxxxxxxx> 2019-06-12 12:57:38
Committer: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> 2019-07-26 09:12:37
Parent: 70cc29dba925b8a99a4917c2b5fa6702d0d496d1 (bpf: fix callees pruning callers)
Child: a98c15177f72ae3c0a736bb324e66c279bf94899 (net: netsec: initialize tx ring on ndo_open)
Branch: remotes/stable/linux-5.1.y
Follows: v5.1.19
Precedes: v5.1.20

PCI: Add missing link delays required by the PCIe spec

[ Upstream commit c2bf1fc212f7e6f25ace1af8f0b3ac061ea48ba5 ]

Currently Linux does not follow PCIe spec regarding the required delays after reset. A concrete example is a Thunderbolt add-in-card that consists of a PCIe switch and two PCIe endpoints:

  +-1b.0-[01-6b]----00.0-[02-6b]--+-00.0-[03]----00.0 TBT controller
                                  +-01.0-[04-36]-- DS hotplug port
                                  +-02.0-[37]----00.0 xHCI controller
                                  \-04.0-[38-6b]-- DS hotplug port

The root port (1b.0) and the PCIe switch downstream ports are all PCIe gen3 so they support 8GT/s link speeds.
We wait for the PCIe hierarchy to enter D3cold (runtime):

  pcieport 0000:00:1b.0: power state changed by ACPI to D3cold

When it wakes up from D3cold, according to the PCIe 4.0 section 5.8 the PCIe switch is put to reset and its power is re-applied. This means that we must follow the rules in PCIe 4.0 section 6.6.1.

[...]

Signed-off-by: Mika Westerberg <mika.westerberg@xxxxxxxxxxxxxxx>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

 drivers/pci/pci.c               | 29 +++++++++++++++++++----------
 drivers/pci/pci.h               |  1 +
 drivers/pci/pcie/portdrv_core.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 86 insertions(+), 10 deletions(-)

--
You are receiving this mail because:
You are watching the assignee of the bug.

----- End forwarded message -----
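
For anyone following along without the spec at hand, here is a rough sketch of the waiting rule in PCIe 4.0 section 6.6.1 that the commit message cites, as I understand it: software waits at least 100 ms before the first configuration request to the device below a downstream port, and for ports that support more than 5 GT/s that wait is counted from the end of link training rather than from reset exit. The function and constant names are made up for illustration. This is not the kernel's implementation and it does not reproduce the elided part of the commit message.

```python
from typing import Tuple

# Hypothetical names; a reading of PCIe 4.0 sec. 6.6.1, not kernel code.
CONFIG_ACCESS_DELAY_MS = 100    # minimum wait before the first config request
FAST_LINK_THRESHOLD_GTS = 5.0   # ports faster than this wait on link training


def wait_plan(max_link_speed_gts: float) -> Tuple[str, int]:
    """Return (event the wait is measured from, minimum wait in ms)."""
    if max_link_speed_gts > FAST_LINK_THRESHOLD_GTS:
        # Gen3 and faster: the clock starts once link training completes.
        return ("link training complete", CONFIG_ACCESS_DELAY_MS)
    # 2.5 GT/s and 5 GT/s ports: the clock starts at exit from reset.
    return ("reset exit", CONFIG_ACCESS_DELAY_MS)


if __name__ == "__main__":
    # The root port and switch ports in the report support 8 GT/s.
    for speed in (2.5, 5.0, 8.0):
        start, ms = wait_plan(speed)
        print(f"{speed} GT/s: wait {ms} ms after {start}")
```

In the topology from the commit message, every port on the path is gen3, so each one would fall into the "link training complete" case.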