Hi Hans, Hans de Goede wrote: > Hi All, > > On 5/29/24 12:21 AM, Bjorn Helgaas wrote: > > On Tue, May 28, 2024 at 09:02:05PM +0200, Hans de Goede wrote: > >> Hi, > >> > >> While testing 6.10-rc1 on a Dell XPS 13 plus 9320 I noticed the following PCI lockdep warnings: > > > > Thanks for the report! I think/hope this is the same as > > > > https://lore.kernel.org/r/171659995361.845588.6664390911348526329.stgit@xxxxxxxxxxxxxxxxxxxxxxxxx > > > > which is queued on pci/for-linus. Let me know if you still see it > > with that patch. > > So I have tried with both related fixes from pci/for-linus added to 6.10.0-rc1+, > unfortunately this does not fix things. > > Note my original logs contained 2 stack-traces, which I now realize might > be 2 different issues: > > 1) The lock_map_assert_held(&dev->cfg_access_lock); check in pci_bridge_secondary_bus_reset() > (drivers/pci/pci.c:4886) triggering because it is called without that lock held So this one is a valid catch and comes from the fact that pci_reset_bus() has never bothered to lock out config cycles over the reset. The problem though is that to fix this the best place to take that missing lock is pci_bus_lock(). Recursive locking is not amenable to lockdep without lock subclass annotations. > > 2) A mutex being unlocked while it was not locked, I thought this locking unbalance also > triggered 1) but I think I was wrong there. This one is another side effect of the pci_bus_lock() problem where lockdep cannot tell the difference between pci_dev_lock(parent) and pci_dev_lock(child). Fixing it would either need a lockdep key per device (likely too runtime expensive) or "nested" lock annotations in pci_bus_lock() (likely too invasive). With those observations I decided to fall back to just catching "missing cfg_access_lock" occurences and skip the full lockdep support for cfg_access_lock: https://lore.kernel.org/linux-cxl/171709637423.1568751.11773767969980847536.stgit@xxxxxxxxxxxxxxxxxxxxxxxxx/ > With the 2 fixes from pci/for-linus added to 6.10.0-rc1+, 1) still happens and > 2) is replaced with a lockdep warning about circular locking now. So we still > have 2 issues and 1) is unchanged. > > Here are the new logs of 6.10.0-rc1 with the 2 fixes from pci/for-linus. I'm leaving > the old logs without the fixes in place further below in for reference. Thanks for the detailed reports, and apologies for the thrash.