On Thu, 2017-08-03 at 13:02 +0300, Kalle Valo wrote:
> "Coelho, Luciano" <luciano.coelho@xxxxxxxxx> writes:
>
> > On Thu, 2017-08-03 at 11:10 +0200, Jiri Kosina wrote:
> > > On Mon, 31 Jul 2017, Jiri Kosina wrote:
> > >
> > > > Hi,
> > > >
> > > > booting current Linus' tree, I'm seeing lockdep splat (see the end of this
> > > > mail).
> > > >
> > > > Apparently, there is AB-BA between tz->lock and mvm->mutex through the CPU
> > > > hotplug lock.
> > > >
> > > > The obvious dependency is: thermal_zone_get_temp() acquires tz->lock, and
> > > > then calls iwl_mvm_tzone_get_temp() (through tz->ops->get_temp()
> > > > callback), which acquires mvm->mutex
> > > >
> > > > The less obvious dependency is primarily caused by iwl_op_mode_mvm_start()
> > > > allocating workqueue (#2 stacktrace) while holding mvm->mutex (which is
> > > > broken, because that mutex is being taken also from CPU hotplug callback
> > > > path, hence the AB-BA).
> > >
> > > As the "central" part of the dependency is being added by iwlwifi driver
> > > (_iwl_pcie_rx_init() allocating workqueue while holding
> > > trans_pcie->mutex), I'm adding iwlwifi folks as well to CC.

[...]
> > > > -> #2 (cpu_hotplug_lock.rw_sem){++++++}:
> > > >        lock_acquire+0xbd/0x220
> > > >        cpus_read_lock+0x46/0x90
> > > >        apply_workqueue_attrs+0x17/0x50
> > > >        __alloc_workqueue_key+0x195/0x4d0
> > > >        _iwl_pcie_rx_init+0x384/0x390 [iwlwifi]
> > > >        iwl_pcie_rx_init+0x1e/0x380 [iwlwifi]
> > > >        iwl_trans_pcie_start_fw+0x295/0x6f0 [iwlwifi]
> > > >        iwl_mvm_load_ucode_wait_alive+0xe7/0x390 [iwlmvm]
> > > >        iwl_run_init_mvm_ucode+0x84/0x320 [iwlmvm]
> > > >        iwl_op_mode_mvm_start+0x964/0xd30 [iwlmvm]
> > > >        _iwl_op_mode_start.isra.9+0x47/0xa0 [iwlwifi]
> > > >        iwl_opmode_register+0xaa/0xd0 [iwlwifi]
> > > >        iwl_mvm_init+0x37/0x1000 [iwlmvm]
> > > >        do_one_initcall+0x51/0x1a9
> > > >        do_init_module+0x60/0x20e
> > > >        load_module+0x203f/0x2b50
> > > >        SYSC_finit_module+0x96/0xd0
> > > >        SyS_finit_module+0xe/0x10
> > > >        entry_SYSCALL_64_fastpath+0x23/0xc2

Okay, so as I understand it the problem has been there for a long time,
but the splat is only coming up now because of Thomas' patch that adds
the lockdep map[1], right?

I see the workqueue allocation you mentioned.  I'll try to move this
allocation out of the mutex and see how it goes.

[1] http://lkml.kernel.org/r/20170524081549.709375845@xxxxxxxxxxxxx

--
Cheers,
Luca.
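
For illustration only, the idea of moving the allocation out of the mutex
can be sketched in userspace C with pthreads. This is not the iwlwifi code
and not the eventual patch. All names here (hotplug_lock, device_ctx,
alloc_queue, device_rx_init) are hypothetical stand-ins: hotplug_lock plays
the role of the CPU hotplug lock that the workqueue allocation takes
internally, and ctx->lock plays the role of the driver mutex.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the lock taken inside the allocator
 * (the CPU hotplug lock in the trace above). */
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for the allocated workqueue. */
struct rx_queue {
	int ncpus;
};

/* Hypothetical stand-in for the driver state and its mutex. */
struct device_ctx {
	pthread_mutex_t lock;
	struct rx_queue *rxq;
};

/* Allocation that takes hotplug_lock internally, the way the
 * workqueue allocation reaches cpus_read_lock() in the trace. */
static struct rx_queue *alloc_queue(void)
{
	struct rx_queue *q = malloc(sizeof(*q));

	if (!q)
		return NULL;

	pthread_mutex_lock(&hotplug_lock);
	q->ncpus = 1; /* placeholder for per-CPU setup */
	pthread_mutex_unlock(&hotplug_lock);
	return q;
}

/* Allocate first, then take ctx->lock only to publish the pointer.
 * hotplug_lock is therefore never acquired while ctx->lock is held,
 * so this path adds no ctx->lock -> hotplug_lock ordering. */
int device_rx_init(struct device_ctx *ctx)
{
	struct rx_queue *q = alloc_queue();

	if (!q)
		return -1;

	pthread_mutex_lock(&ctx->lock);
	if (!ctx->rxq) {
		ctx->rxq = q;
		q = NULL; /* ownership moved to ctx */
	}
	pthread_mutex_unlock(&ctx->lock);

	free(q); /* no-op unless a queue was already installed */
	return 0;
}
```

The cost of this ordering is a possibly wasted allocation when a queue is
already installed; in exchange, the lock taken by the allocator never nests
inside the device lock.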