On Tue, Jun 04, 2024 at 08:20:53AM -0700, Dan Williams wrote:
> Imre Deak wrote:
> > Hi,
> >
> > [Sorry for the previous message, resending it now
> > with proper In-reply-to: header added.]
> >
> > I see a similar issue, a corruption in the lock_keys_hash while
> > alloc_workqueue()->lockdep_register_key() iterates it, see [1] for the
> > stacktrace.
> >
> > Not sure if related or even will solve [1], but [2] will revert
> >
> > commit 7e89efc6e9e4 ("PCI: Lock upstream bridge for pci_reset_function()")
> >
> > which does
> >
> >     lockdep_register_key(&dev->cfg_access_key);
> >
> > in pci_device_add() and doesn't unregister the key when the pci device is
> > removed (and potentially freed); so basically 7e89efc6e9e4 was missing a
> >
> >     lockdep_unregister_key();
> >
> > in pci_destroy_dev().
> >
> > Based on the above I wonder if 7e89efc6e9e4 could also lead to the
> > corruption of lock_keys_hash after a pci device is removed.
>
> Are you running with the revert applied and still seeing issues?

The revert is not yet applied and so [1] happened with a kernel
containing 7e89efc6e9e4.

[1] https://intel-gfx-ci.01.org/tree/drm-tip/IGT_7875/bat-atsm-1/dmesg0.txt