Hi,

On Fri, 2017-03-24 at 09:56 +0100, Johannes Berg wrote:
> On Thu, 2017-03-23 at 16:29 +0100, Johannes Berg wrote:
> > Isn't it possible for the following to happen?
> >
> > CPU1                                CPU2
> >
> > mutex_lock(&M); // acquires mutex
> >                                     full_proxy_xyz();
> >                                       srcu_read_lock(&debugfs_srcu);
> >                                       real_fops->xyz();
> >                                         mutex_lock(&M); // waiting for mutex
> > debugfs_remove(F);
> >   synchronize_srcu(&debugfs_srcu);
>
> So I'm pretty sure that this can happen. I'm not convinced that it's
> happening here, but still.

I'm a bit confused, in that SRCU, of course, doesn't wait until all the
readers are done - that'd be a regular reader/writer lock or something.
However, it does (have to) wait until all the currently active
read-side sections have terminated, which still leads to a deadlock in
the example above, I think?

In his 2006 LWN article, Paul wrote:

    The designer of a given subsystem is responsible for: (1) ensuring
    that SRCU read-side sleeping is bounded and (2) limiting the amount
    of memory waiting for synchronize_srcu(). [1]

In the case of debugfs files acquiring locks, (1) can't really be
guaranteed, especially if those locks can be held while doing
synchronize_srcu() [via debugfs_remove], so I still think the lockdep
annotation needs to be changed, to at least have some annotation at
synchronize_srcu() time so we can detect this.

Now, I still suspect there's some other bug here in the case that I'm
seeing, because I don't actually see the "mutex_lock(&M); // waiting"
piece in the traces. I'll need to run this with some tracing on Monday
when the test guys are back from the weekend.

I'm also not sure how I can possibly fix this in the debugfs code in
mac80211 and friends, but that's perhaps a different story. Clearly,
this debugfs patch is a good thing - the code would likely have had
use-after-free problems in this situation without it. But flagging the
potential deadlocks would make it a lot easier to find them.

[1] https://lwn.net/Articles/202847/

johannes
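
P.S. To make the pattern concrete, here's a minimal sketch of a module
that should recreate it - purely hypothetical, all the my_* names are
made up, and it assumes the SRCU-based removal from this patch, where
debugfs_remove() ends up in synchronize_srcu(&debugfs_srcu):

    #include <linux/debugfs.h>
    #include <linux/errno.h>
    #include <linux/fs.h>
    #include <linux/module.h>
    #include <linux/mutex.h>

    static DEFINE_MUTEX(my_mutex);     /* plays the role of M */
    static struct dentry *my_file;     /* plays the role of F */

    /* CPU2 side: invoked via full_proxy_read(), i.e. between
     * srcu_read_lock(&debugfs_srcu) and srcu_read_unlock(). */
    static ssize_t my_read(struct file *file, char __user *buf,
                           size_t count, loff_t *ppos)
    {
            mutex_lock(&my_mutex);     /* waits if teardown holds M */
            /* ... read state protected by my_mutex ... */
            mutex_unlock(&my_mutex);
            return 0;
    }

    static const struct file_operations my_fops = {
            .owner = THIS_MODULE,
            .read = my_read,
    };

    static int __init my_init(void)
    {
            my_file = debugfs_create_file("my_file", 0400, NULL, NULL,
                                          &my_fops);
            return my_file ? 0 : -ENOMEM;
    }

    /* CPU1 side: teardown while holding the same mutex. The SRCU-based
     * debugfs_remove() waits in synchronize_srcu() for my_read() to
     * leave its read-side section - but my_read() is blocked on
     * my_mutex, which we hold: deadlock. */
    static void __exit my_exit(void)
    {
            mutex_lock(&my_mutex);
            debugfs_remove(my_file);
            mutex_unlock(&my_mutex);
    }

    module_init(my_init);
    module_exit(my_exit);
    MODULE_LICENSE("GPL");

Reading the file while unloading the module should then hang both tasks
for good - and that's exactly the shape lockdep could flag if
synchronize_srcu() carried an annotation pairing it against the SRCU
read-side sections.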