Re: PCI core lockdep warning with 6.10-rc1 on Dell XPS 13 plus 9320

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Hans,

Hans de Goede wrote:
> Hi All,
> 
> On 5/29/24 12:21 AM, Bjorn Helgaas wrote:
> > On Tue, May 28, 2024 at 09:02:05PM +0200, Hans de Goede wrote:
> >> Hi,
> >>
> >> While testing 6.10-rc1 on a Dell XPS 13 plus 9320 I noticed the following PCI lockdep warnings:
> > 
> > Thanks for the report!  I think/hope this is the same as
> > 
> >   https://lore.kernel.org/r/171659995361.845588.6664390911348526329.stgit@xxxxxxxxxxxxxxxxxxxxxxxxx
> > 
> > which is queued on pci/for-linus.  Let me know if you still see it
> > with that patch.
> 
> So I have tried with both related fixes from pci/for-linus added to 6.10.0-rc1+,
> unfortunately this does not fix things.
> 
> Note my original logs contained 2 stack-traces, which I now realize might
> be 2 different issues:
> 
> 1) The lock_map_assert_held(&dev->cfg_access_lock); check in pci_bridge_secondary_bus_reset()
>    (drivers/pci/pci.c:4886) triggering because it is called without that lock held

So this one is a valid catch and comes from the fact that
pci_reset_bus() has never bothered to lock out config cycles over the
reset. The problem though is that to fix this the best place to take
that missing lock is pci_bus_lock(). Recursive locking is not amenable
to lockdep without lock subclass annotations.

> 
> 2) A mutex being unlocked while it was not locked, I thought this locking unbalance also
>    triggered 1) but I think I was wrong there.

This one is another side effect of the pci_bus_lock() problem where
lockdep cannot tell the difference between pci_dev_lock(parent) and
pci_dev_lock(child). Fixing it would either need a lockdep key per
device (likely too runtime expensive) or "nested" lock annotations in
pci_bus_lock() (likely too invasive).

With those observations I decided to fall back to just catching
"missing cfg_access_lock" occurences and skip the full lockdep support
for cfg_access_lock:

https://lore.kernel.org/linux-cxl/171709637423.1568751.11773767969980847536.stgit@xxxxxxxxxxxxxxxxxxxxxxxxx/

> With the 2 fixes from pci/for-linus added to 6.10.0-rc1+, 1) still happens and
> 2) is replaced with a lockdep warning about circular locking now. So we still
> have 2 issues and 1) is unchanged.
> 
> Here are the new logs of 6.10.0-rc1 with the 2 fixes from pci/for-linus. I'm leaving
> the old logs without the fixes in place further below in for reference.

Thanks for the detailed reports, and apologies for the thrash.




[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux