Re: Question about deadlock between AER and pciehp interrupts during resume from S3 with unplugged device

On Fri, Feb 11, 2022 at 02:42:21PM +0000, Kumar1, Rahul wrote:
> We can see some changes in the lspci output between the working and non-working cases. The changes are below:
> Link Speed =  8GT/s  -> 2.5GT/s.
> DLActive+   ->     DLActive-
> BWMgmt+   -> BWMgmt+
> PresDet+ -> PresDet+
> EqualizationComplete+ -> EqualizationComplete+
> 
> Also when we do reset via sysfs, we don't see this issue.
> 
> I have created bug here https://bugzilla.kernel.org/show_bug.cgi?id=215590

So with the patches applied, the link doesn't come up after resume,
but it does come up if you then reset via sysfs.  Is that what you're
saying?

The dmesg excerpt Andrey posted shows an AER splat after resume (even
with the patches applied):

[   69.684921] pcieport 0000:00:01.1: AER: Root Port link has been reset
[   69.691438] pcieport 0000:00:01.1: AER: Device recovery failed
[   69.697327] pcieport 0000:00:01.1: AER: Multiple Uncorrected (Fatal) error received: 0000:00:01.0
[   69.706231] pcieport 0000:00:01.1: AER: can't find device of ID0008
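
To pull lines like these out of a full dmesg capture, a small filter
along the following lines may help.  This is only a sketch: the regular
expression assumes the default dmesg timestamp format shown above, and
the names AER_LINE, SAMPLE and aer_events are made up for illustration.

```python
import re

# Assumed line shape (taken from the excerpt above):
# [   69.691438] pcieport 0000:00:01.1: AER: Device recovery failed
AER_LINE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+\S+\s+"
    r"(?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]):\s+"
    r"AER:\s+(?P<msg>.*)$"
)

# The four lines quoted in this mail, used as sample input.
SAMPLE = """\
[   69.684921] pcieport 0000:00:01.1: AER: Root Port link has been reset
[   69.691438] pcieport 0000:00:01.1: AER: Device recovery failed
[   69.697327] pcieport 0000:00:01.1: AER: Multiple Uncorrected (Fatal) error received: 0000:00:01.0
[   69.706231] pcieport 0000:00:01.1: AER: can't find device of ID0008
"""


def aer_events(dmesg_text):
    """Return (timestamp, reporting device, message) for each AER line."""
    events = []
    for line in dmesg_text.splitlines():
        match = AER_LINE.match(line.strip())
        if match:
            events.append(
                (float(match.group("ts")), match.group("bdf"), match.group("msg"))
            )
    return events


if __name__ == "__main__":
    for ts, bdf, msg in aer_events(SAMPLE):
        print(f"{ts:10.6f}  {bdf}  {msg}")
```

Passing the text of a saved dmesg log to aer_events() returns one tuple
per AER message, which makes it easier to see the order of the AER
events relative to the resume.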

I suspect the Root Port refuses to train the link due to that fatal
error.  Perhaps Kai-Heng Feng's patch is incomplete and it needs to
clear stale AER errors?  Or maybe it re-enables AER too early?

Could you attach lspci -vv output before/after suspend to the bugzilla?
And also attach full dmesg output with the patches applied?
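
For comparing the two lspci captures, something like the sketch below
might be handy.  It is just a line-based diff using Python's difflib,
not a real lspci parser.  BEFORE and AFTER are abbreviated stand-ins
built from the values Rahul listed, not actual output from the affected
machine, and lspci_diff is a made-up helper name.

```python
import difflib

# Abbreviated stand-ins for "lspci -vv" captures (assumption: real
# captures contain many more lines and fields than shown here).
BEFORE = "\n".join([
    "LnkSta: Speed 8GT/s",
    "        DLActive+ BWMgmt+",
    "SltSta: PresDet+",
])
AFTER = "\n".join([
    "LnkSta: Speed 2.5GT/s",
    "        DLActive- BWMgmt+",
    "SltSta: PresDet+",
])


def lspci_diff(before, after):
    """Return unified-diff lines (no context) between two captures."""
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile="before-suspend",
            tofile="after-resume",
            lineterm="",
            n=0,  # show only the lines that changed
        )
    )


if __name__ == "__main__":
    print("\n".join(lspci_diff(BEFORE, AFTER)))
```

The real inputs would come from running "lspci -vv" as root (so the
capability registers are readable) once before suspend and once after
resume, for example restricted to the Root Port with "-s 00:01.1".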

Thanks,

Lukas


