On Mon, Sep 20, 2021 at 2:17 PM Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote: > > On Mon, Sep 20, 2021 at 01:52:21PM -0700, Dan Williams wrote: > > On Fri, Sep 17, 2021 at 10:05 PM Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote: > > > This deadlock was first reported with the zram driver, however the live > > > patching folks have acknowledged they have observed this as well with > > > live patching, when a live patch is removed. I was then able to > > > reproduce easily by creating a dedicated selftests. > > > > > > A sketch of how this can happen follows: > > > > > > CPU A CPU B > > > whatever_store() > > > module_unload > > > mutex_lock(foo) > > > mutex_lock(foo) > > > del_gendisk(zram->disk); > > > device_del() > > > device_remove_groups() > > > > This flow seems possible to trigger with: > > > > echo $dev > /sys/bus/$bus/drivers/$driver/unbind > > > > I am missing why module pinning > > The aspect of try_module_get() which comes to value to prevent the > deadlock is it ensures kernfs ops do not run once exit is on the way. > > > is part of the solution when it's the > > device_del() path that is racing? > > But its not, the device_del() path will yield until the kernfs op > completes. It is fine to wait there. > > The deadlock happens if a module exit routine uses a lock which is > also used on a sysfs op. If the lock was first held by module exit, > and module exit is waiting for the kernfs op to complete, and the > kernfs op is waiting to hold the same lock then the exit will wait > forever. > > > Module removal is just a more coarse > > grained way to trigger unbind => device_del(). > > Right, but the device_del() path is not sharing a lock with the sysfs op. The deadlock in the example comes from holding a lock over device_del() that is also taken in a kernfs op. For example, the code above looks like something that runs from driver.remove(), not exclusively module_exit(). Yes, module_exit() may trigger driver.remove() via driver_unregister(), but there's other ways to trigger driver.remove() that do not involve module_exit(). > > Isn't the above a bug > > in the driver, not missing synchronization in kernfs? > > We can certainly take the position as an alternative: > > "thou shalt not use a lock on exit which is also used on a syfs op" > > However that seems counter intuitive, specially if we can resolve the > issue easily with a try_module_get(). Again, I don't see how try_module_get() can affect the ABBA failure case of holding a lock over device_del() that is also held inside sysfs op. I agree that the problem is subtle. Does lockdep not complain about this case? If it's going to be avoided in the core it seems try_module_get() does not completely cover the hole that unsuspecting driver writers might fall into.