On Thu, Dec 13, 2018 at 10:38:14AM +0100, Rafael J. Wysocki wrote: > On Mon, Dec 10, 2018 at 9:47 AM Daniel Vetter <daniel.vetter@xxxxxxxx> wrote: > > > > Drivers might want to remove some sysfs files, which needs the same > > locks and ends up angering lockdep. Relevant snippet of the stack > > trace: > > > > kernfs_remove_by_name_ns+0x3b/0x80 > > bus_remove_driver+0x92/0xa0 > > acpi_video_unregister+0x24/0x40 > > i915_driver_unload+0x42/0x130 [i915] > > i915_pci_remove+0x19/0x30 [i915] > > pci_device_remove+0x36/0xb0 > > device_release_driver_internal+0x185/0x250 > > unbind_store+0xaf/0x180 > > kernfs_fop_write+0x104/0x190 > > Is the acpi_bus_unregister_driver() in acpi_video_unregister() the > source of the lockdep unhappiness? Yeah I guess I cut out too much of the lockdep splat. It complains about kernfs_fop_write and kernfs_remove_by_name_ns acquiring the same lock class. It's ofc not the same lock, so no real deadlock. Getting the device_release_driver outside of the callchain under kernfs_fop_write, which this patch does, "fixes" it. For "fixes" = shut up lockdep. Other options: - Anotate the recursion with the usual lockdep annotations. Potentially results in lockdep not catching real deadlocks (you can still have other loops closing the deadlock, maybe through some subsystem/bus lock). - Rewrite kernfs_fop_write to drop the lock (optionally, for callbacks that know what they're doing), which should be fine if we refcount everything properly (bus, driver & device). - Also note that probably the same bug exists on the bind sysfs interface, but we don't use that, so I don't care :-) - Most of these issues are never visible in normal usage, since normally driver bind/unbind is done from a kthread or model_load/unload, neither of which is running in the context of that kernfs mutex kernfs_fop_write holds. That's why I think the task work is the best solution, since it changes the locking context of the unbind sysfs to match the locking context of module unload and hotunplug. Unfortunately that trick doesn't work for the bind sysfs file, since that way we can't thread the errno value back to userspace. Cheers, Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel