Re: Possible PCI Regression Linux 5.3-rc1

On Wed, Jul 24, 2019 at 08:38:14AM -0500, Bjorn Helgaas wrote:
> On Wed, Jul 24, 2019 at 12:54:00PM +0000, Nicholas Johnson wrote:
> > Hi all,
> > 
> > I was just rebasing my patches for linux 5.3-rc1 and noticed a possible 
> > regression that shows on both of my machines. It is also reproducible 
> > with the unmodified Ubuntu mainline kernel, downloadable at [1].
> > 
> > Running the lspci command takes 1-3 seconds with 5.3-rc1 (rather than an 
> > imperceivable amount of time). Booting with pci.dyndbg does not reveal 
> > why.
> > 
> > $ uname -r
> > 5.3.0-050300rc1-generic
> > $ time lspci -vt 1>/dev/null
> > 
> > real	0m2.321s
> > user	0m0.026s
> > sys	0m0.000s
> > 
> > If none of you are aware of this or what is causing it, I will submit a 
> > bug report to Bugzilla.
> 
> I wasn't aware of this; thanks for reporting it!  I wasn't able to
> reproduce this in qemu.  Can you play with "strace -r lspci -vt" and
> the like?  Maybe try "lspci -n" to see if it's related to looking up
> the names?

For a second you had me doubting myself - it could have been an 
Ubuntu-specific issue. But no, I just reproduced it on Arch Linux, and 
double-checked that it does not happen on 5.2. The problem also occurs 
without the PCI kernel parameters that I usually pass.

Looking into this further, it seems that removing the Thunderbolt 
controller through sysfs makes the problem go away (XX is the bus 
number behind the root port):

$ echo 1 | sudo tee /sys/bus/pci/devices/0000\:XX\:00.0/remove

Removing only the USB controller that belongs to the Thunderbolt 
controller alleviates the problem for a few seconds before it returns - 
I have no idea why. Removing the whole Thunderbolt controller below the 
root port solves the problem indefinitely.
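
One way to narrow this down further would be to time a config-space 
read for each device individually. This is only a sketch, and it rests 
on an assumption that I have not confirmed: that the delay comes from 
reading the sysfs "config" files. The function name time_config_reads 
is made up for illustration, and "date +%s%N" requires GNU date.

```shell
# Hypothetical helper: time one read of each device's sysfs "config" file.
# Assumption (not confirmed in this thread): the slowness is in these reads.
time_config_reads() {
    # $1: directory holding one sub-directory per PCI device
    root="${1:-/sys/bus/pci/devices}"
    for dev in "$root"/*; do
        # Skip entries without a readable config file
        [ -r "$dev/config" ] || continue
        start=$(date +%s%N)                 # nanoseconds, GNU date only
        cat "$dev/config" >/dev/null 2>&1   # reads whatever this user may read
        end=$(date +%s%N)
        # Print the device address and the elapsed time in milliseconds
        printf '%s %d ms\n' "${dev##*/}" "$(( (end - start) / 1000000 ))"
    done
}
```

Running "time_config_reads | sort -k2 -n | tail" would then list the 
slowest devices last. If nothing stands out, the delay is presumably 
somewhere other than the config reads.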

This would explain why you cannot reproduce it in QEMU - there is no 
Thunderbolt controller.

It could be a coincidence that Thunderbolt triggers it, but Mika 
Westerberg might be interested now.

Doing "lspci -n" makes no difference - it is slow whenever the normal 
command is.

Doing "strace lspci -vt" produced a lot of output that I cannot easily 
summarise. But if you have access to a physical system with Thunderbolt, 
then you might be able to reproduce the issue and have a better chance 
of pinpointing the problem than I do.
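
A long trace could be boiled down by recording per-call durations with 
"strace -T" and sorting on them. The following is a minimal sketch, not 
something I have run against the affected machines; the function name 
slowest_syscalls and the log file name used below are made up for 
illustration.

```shell
# Hypothetical helper: list the slowest system calls in a "strace -T" log.
# With -T, strace appends the time spent in each call as "<seconds>".
slowest_syscalls() {
    # $1: strace log written with -T, $2: number of lines to show (default 10)
    awk 'match($0, /<[0-9]+\.[0-9]+>$/) {
             # Prefix each line with its duration so it can be sorted
             print substr($0, RSTART + 1, RLENGTH - 2), $0
         }' "$1" | LC_ALL=C sort -rn | head -n "${2:-10}"
}
```

The log can be captured with 
"strace -T -o lspci.trace lspci -vt >/dev/null" and then summarised 
with "slowest_syscalls lspci.trace". The path or file descriptor in 
the slowest calls should point at the device responsible.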

Thanks for looking at this.

Kind regards,
Nicholas


