Imre Deak wrote: [..] > > > Also Imre tried with 2 PCI patches together https://patchwork.freedesktop.org/series/134193/ > > > And still not good for those 4 systems (mtlp-9, bat-dg2-13/14 and bat-adlp-11) : > > > https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_134193v1/index.html? > > > Dave, Dan, thoughts? > > > > Can you provide the dmesg from the failure system with the 2 patches applied please? > > For the above 4 machines, mtlp-9 not having the originally reported WARN > (at pci.c:4886) only some other lockdep issue, while the other 3 > machines having both the originally reported one and the other lockdep > issue: > https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_134193v1/bat-mtlp-9/boot0.txt This one does not seem to implicate cfg_access_lock at all. I wonder if you revert the lockdep annotation completely if it still fails. I.e. this is a new lockdep report for v6.10-rc independent of this new cfg_access_lock annotation. > https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_134193v1/bat-dg2-13/boot0.txt > https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_134193v1/bat-dg2-14/boot0.txt > https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_134193v1/bat-adlp-11/boot0.txt These are all identical and are pointing out that vmd, via pci_reset_bus(), has long been performing an unlocked secondary bus reset that userspace could race and confuse the kernel. I think the fix for that is below, but this is an increasingly spicy level of change that gives me some pause, i.e. teach pci_bus_lock() to lock the bridge itself: -- 8< -- diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 59e0949fb079..ac3999bc59e8 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -5442,6 +5442,7 @@ static void pci_bus_lock(struct pci_bus *bus) { struct pci_dev *dev; + pci_dev_lock(bus->self); list_for_each_entry(dev, &bus->devices, bus_list) { pci_dev_lock(dev); if (dev->subordinate) @@ -5459,6 +5460,7 @@ static void pci_bus_unlock(struct pci_bus *bus) pci_bus_unlock(dev->subordinate); pci_dev_unlock(dev); } + pci_dev_unlock(bus->self); } /* Return 1 on successful lock, 0 on contention */ @@ -5466,6 +5468,7 @@ static int pci_bus_trylock(struct pci_bus *bus) { struct pci_dev *dev; + pci_dev_lock(bus->self); list_for_each_entry(dev, &bus->devices, bus_list) { if (!pci_dev_trylock(dev)) goto unlock; @@ -5484,6 +5487,7 @@ static int pci_bus_trylock(struct pci_bus *bus) pci_bus_unlock(dev->subordinate); pci_dev_unlock(dev); } + pci_dev_unlock(bus->self); return 0; }