I am seeing a deadlock while loading the enic driver with SR-IOV enabled.

CPU0                                      CPU1
---------------------------------------------------------------------
__driver_attach()
device_lock(&dev->mutex) <--- device mutex lock here
driver_probe_device()
pci_enable_sriov()
pci_iov_add_virtfn()
pci_device_add()
                                          aer_isr()                <--- pci aer error
                                          do_recovery()
                                          broadcast_error_message()
                                          pci_walk_bus()
                                          down_read(&pci_bus_sem)  <--- rd sem
down_write(&pci_bus_sem) <--- stuck on wr sem
                                          report_error_detected()
                                          device_lock(&dev->mutex) <--- DEADLOCK

This can also happen when an AER error occurs while pci_dev->sriov_config()
is being called.

The only fix I could think of is to take &pci_bus_sem and then try to lock
every device->mutex under that pci_bus. If any trylock fails, unlock all the
device->mutex locks taken so far, release &pci_bus_sem, and try again. This
approach seems hackish, but I do not have a better solution. I would like to
open the discussion on this.

Patches 1 and 2 refactor the PCI locking API. Patch 3 fixes the issue.

With the current fix, we hold the mutex of the parent device and of all the
devices under the bus. This can exceed the size of held_locks in lockdep if
the number of devices (VFs) exceeds 48. Patch 4 extends this to 63, the
maximum supported by lockdep.

Govindarajulu Varadarajan (4):
  pci: introduce __pci_walk_bus for caller with pci_bus_sem held
  pci: code refactor pci_bus_lock/unlock/trylock
  pci aer: fix deadlock in do_recovery
  lockdep: make MAX_LOCK_DEPTH configurable from Kconfig

 drivers/pci/bus.c                  | 13 ++++++++--
 drivers/pci/pci.c                  | 38 ++++++++++++++++++++---------
 drivers/pci/pcie/aer/aerdrv_core.c | 50 ++++++++++++++++++++++++++++++--------
 fs/configfs/inode.c                |  2 +-
 include/linux/pci.h                | 18 ++++++++++++++
 include/linux/sched.h              |  3 +--
 kernel/locking/lockdep.c           | 13 +++++-----
 lib/Kconfig.debug                  | 10 ++++++++
 8 files changed, 115 insertions(+), 32 deletions(-)

--
2.14.1