Re: regression: PCIe resume from suspend stalls I/O and causes interrupt storms in Linux 5.3-rc2 (5.2.5, 5.1.20) on Ryzen 7 1700/AMD X370 MSI board since 5817d78eba34f6c86f5462ae2c5212f80a013357, 5.2/5.3 w/ pcieIRQ loop.

On Fri, Aug 02, 2019 at 09:03:06PM +0200, Matthias Andree wrote:
> Greetings, 

Hi,

> Commit 5817d78eba34f6c86f5462ae2c5212f80a013357 (written by Mika
> Westerberg) causes regressions on resume from S3 suspend on my MSI X370
> w/ Ryzen 7 1700, which is, TTBOMK, a PCI Express 3.0 platform.
> Consequences are hung disk and net I/O although re-login to GNOME works
> on 5.1.20, albeit very slowly. The machine is unusable after resume from
> that point.
> 
> 5.2.5 and 5.3-rc2 will go into a tight loop of pcieport 0000:00:01.3:
> PME: Spurious native interrupt! and need to be rebooted.
> 
> bad: v5.3-rc2
> 
> good: v5.3-rc2-111-g97b00aff2c45 + "git revert 5817d78eba"
> 
> Reverting that commit shown above restores suspend functionality for me,
> two S3 suspend/resume cycles work.
> 
> For details, more information (lspci, versions found) is at:
> 
> * Kernel Bugzilla, https://bugzilla.kernel.org/show_bug.cgi?id=204413
> 
> * Fedora/Redhat Bugzilla,
> https://bugzilla.redhat.com/show_bug.cgi?id=1737046
> 
> 
> Same findings for v5.2.5 on stable kernel, reverting the relevant commit
> (SHA is 5817d78eba34f6c86f5462ae2c5212f80a013357 there) also fixes
> suspend/resume problems for me.
> 
> Let me know if you need me to pull out any further hardware or kernel
> debug info, but please be specific with instructions - I am not a kernel
> hacker (although I have been exposed to C for nearly 30 years and
> Linux/FreeBSD for some 20 years). Pointing me to relevant URLs with
> debug instructions is fine. I have a Git tree handy and this octocore
> sitting here compiles a kernel in < 10 minutes.

Are you able to get dmesg after resume, or is it completely dead? It
would help if we could see how long it tries to wait for the downstream
link by passing "pcieportdrv.dyndbg" to the kernel command line.

Can you also try to revert 00ebf1348cb332941dab52948f29480592bfbe6a
("PCI/PME: Replace dev_printk(KERN_DEBUG) with dev_info()") so that it
does not spam dmesg too much?

Thanks!


