On Wednesday, September 02, 2015 12:14:45 PM Tejun Heo wrote:
> On Tue, Sep 01, 2015 at 03:12:34PM +0800, Jiang Liu wrote:
> > Hi Rafael and Tejun,
> > When running CPU hotplug tests, it triggers a lockdep warning
> > as follows. The two possible deadlock paths are:
> > 1) echo x > /sys/devices/system/cpu/cpux/online
> >      ->kernfs_fop_write()
> >      ->kernfs_get_active()
> > 1.a) ->rwsem_acquire_read(&kn->dep_map, 0, 1, _RET_IP_);
> >      ->cpu_up()
> > 1.b) ->cpu_hotplug_begin()[lock_map_acquire(&cpu_hotplug.dep_map)]
> > 2) hardware triggers hotplug events
> >      ->acpi_device_hotplug()
> >      ->acpi_processor_remove()
> > 2.a) ->cpu_hotplug_begin()[lock_map_acquire(&cpu_hotplug.dep_map)]
> >      ->unregister_cpu()
> >      ->device_del()
> >      ->kernfs_remove_by_name_ns()
> >      ->__kernfs_remove()
> >      ->kernfs_drain()
> > 2.b) ->rwsem_acquire(&kn->dep_map, 0, 0, _RET_IP_)
> >
> > So there is a possible deadlock scenario among 1.a, 1.b, 2.a and 2.b.
> > I'm not familiar with kernfs, so could you please help to comment:
> > 1) whether this is a real deadlock issue?
>
> Yes, it seems to be.  It's highly unlikely but still possible.

Hmm.

So acpi_device_hotplug() calls lock_device_hotplug() which simply
acquires device_hotplug_lock.  It is held throughout the entire
hot-add/hot-remove code path.

Writing anything to /sys/devices/system/cpu/cpux/online goes through
online_store() in drivers/base/core.c and that does
lock_device_hotplug_sysfs() which then attempts to acquire
device_hotplug_lock using mutex_trylock().  And it only calls either
device_online() or device_offline() if it ends up with the lock held.

Quite frankly, I don't see how these particular two code paths can
deadlock in any way.

So either a third code path is involved which is not executed under
device_hotplug_lock, or lockdep needs to be told to actually take
device_hotplug_lock into account in this case IMO.

> > 2) any recommended way to get it fixed?
>
> This usually happens with "delete" files and it's worked around by
> performing special self-removal on the file before actually removing
> the device.  I suppose on/offline files would need to turn off
> active_protection with kernfs_[un]break_active_protection() which
> should probably grow sysfs and device layer wrappers.

Thanks,
Rafael
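
To illustrate the point about mutex_trylock() on the sysfs write path, here is a rough userspace analogy using POSIX threads. It is not kernel code: hotplug_lock merely stands in for device_hotplug_lock, and hotplug_path_lock(), hotplug_path_unlock(), sysfs_path_trylock() and sysfs_online_store() are hypothetical names made up for this sketch. Returning -EAGAIN on contention is also an assumption of the sketch; the only thing taken from the discussion is that the sysfs side never blocks on the lock and only does the online/offline work when it actually holds it.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Stand-in for device_hotplug_lock (userspace analogy only). */
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hot-add/hot-remove path: blocks until the lock is available. */
static void hotplug_path_lock(void)
{
	int rc = pthread_mutex_lock(&hotplug_lock);

	assert(rc == 0);
	(void)rc;
}

static void hotplug_path_unlock(void)
{
	int rc = pthread_mutex_unlock(&hotplug_lock);

	assert(rc == 0);
	(void)rc;
}

/*
 * sysfs write path: never waits for the lock.
 * Returns 0 with the lock held, or -EAGAIN (an assumption of this
 * sketch) so that the caller backs off and retries later.
 */
static int sysfs_path_trylock(void)
{
	if (pthread_mutex_trylock(&hotplug_lock) == 0)
		return 0;

	return -EAGAIN;
}

/*
 * Hypothetical counterpart of an "online" attribute store: the state
 * is only changed if the lock was actually acquired.
 */
static int sysfs_online_store(int *online, int value)
{
	int ret = sysfs_path_trylock();

	if (ret)
		return ret;	/* lock busy: do nothing, do not block */

	*online = value;	/* the online/offline work would go here */
	hotplug_path_unlock();
	return 0;
}
```

Since the writer backs off instead of sleeping on the lock, it cannot sit inside the sysfs write path waiting for a hot-remove operation that holds the lock, which is the reasoning given above for why these two paths should not deadlock against each other.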