On Mon, Sep 20, 2021 at 02:36:38PM -0700, Bart Van Assche wrote:
> On 9/17/21 10:04 PM, Luis Chamberlain wrote:
> > A sketch of how this can happen follows:
> >
> > CPU A                              CPU B
> >                                    whatever_store()
> > module_unload
> >   mutex_lock(foo)
> >                                    mutex_lock(foo)
> >    del_gendisk(zram->disk);
> >      device_del()
> >        device_remove_groups()
> >
> > In this situation whatever_store() is waiting for the mutex foo to
> > become unlocked, but that won't happen until module removal is complete.
> > But module removal won't complete until the sysfs file being poked
> > completes which is waiting for a lock already held.
>
> If I remember correctly I encountered the deadlock scenario described
> above for the first time about ten years ago while working on the SCST
> project. We solved this deadlock by removing the sysfs attributes from
> the module unload code before grabbing mutex_lock(foo), e.g. by calling
> sysfs_remove_file().

Well the sysfs attributes in zram do tons of funky mucking around so
unfortunately no. It's not the only driver where this can happen. It is
why I decided to work on a generic solution instead.

> This works because calling sysfs_remove_file()
> multiple times in a row is safe. Is that solution good enough for the
> zram driver?

The sysfs attributes are group attributes part of the block, and so are
removed for the driver on a del_gendisk(). So unfortunately no, this
would not be a good solution in this case.

  Luis
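
For anyone following along, here is a rough userspace toy model of the
two orderings discussed in this thread. It is only a sketch, not kernel
code: struct model, try_lock_foo(), try_remove_groups() and the two
replay_*() helpers are hypothetical names made up for illustration. The
"try" helpers return false at the point where the real call would
sleep, and the model assumes that removing a sysfs attribute waits
for any store callback still running on it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state; none of these names are kernel API. */
struct model {
	bool foo_held;      /* the driver mutex "foo" is locked */
	int  sysfs_active;  /* sysfs store callbacks currently running */
};

/* Stand-in for mutex_lock(foo): false where the real call would sleep. */
static bool try_lock_foo(struct model *m)
{
	if (m->foo_held)
		return false;
	m->foo_held = true;
	return true;
}

/*
 * Stand-in for device_remove_groups(): assumed to wait until no store
 * callback is running. Returns false where the real call would sleep.
 */
static bool try_remove_groups(const struct model *m)
{
	return m->sysfs_active == 0;
}

/* Replays the quoted interleaving; true means both CPUs are stuck. */
static bool replay_deadlock(void)
{
	struct model m = { .foo_held = false, .sysfs_active = 0 };

	m.sysfs_active++;                        /* CPU B: whatever_store() entered */
	bool a_has_foo = try_lock_foo(&m);       /* CPU A: module_unload, mutex_lock(foo) */
	assert(a_has_foo);
	bool b_blocked = !try_lock_foo(&m);      /* CPU B: mutex_lock(foo) */
	bool a_blocked = !try_remove_groups(&m); /* CPU A: del_gendisk() path, foo held */

	return a_blocked && b_blocked;
}

/* Replays the "remove attributes before taking foo" ordering. */
static bool replay_remove_first(void)
{
	struct model m = { .foo_held = false, .sysfs_active = 0 };

	m.sysfs_active++;                        /* CPU B: whatever_store() entered */
	bool a_waits = !try_remove_groups(&m);   /* CPU A waits, but holds no lock */
	bool b_got_foo = try_lock_foo(&m);       /* CPU B: mutex_lock(foo) succeeds */
	m.foo_held = false;                      /* CPU B: mutex_unlock(foo) */
	m.sysfs_active--;                        /* CPU B: whatever_store() returns */
	bool a_removed = try_remove_groups(&m);  /* CPU A: removal can now finish */
	bool a_got_foo = try_lock_foo(&m);       /* CPU A: mutex_lock(foo) */

	return a_waits && b_got_foo && a_removed && a_got_foo;
}
```

In this model replay_deadlock() reports that both sides are blocked,
while replay_remove_first() runs to completion. The second ordering is
only available when the driver controls when its attributes go away,
which is not the case when they are removed as part of del_gendisk().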