On Thu, 2017-03-23 at 16:29 +0100, Johannes Berg wrote:
> Isn't it possible for the following to happen?
>
> CPU1                                  CPU2
>
> mutex_lock(&M);
>                                       full_proxy_xyz();
>                                       srcu_read_lock(&debugfs_srcu);
>                                       real_fops->xyz();
>                                       mutex_lock(&M);
> debugfs_remove(F);
> synchronize_srcu(&debugfs_srcu);

So I'm pretty sure that this can happen. I'm not convinced that it's
happening here, but still.

I tried to make lockdep flag it, but the only way I could get it to
flag it was to do this:

--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -235,7 +235,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
 	preempt_disable();
 	retval = __srcu_read_lock(sp);
 	preempt_enable();
-	rcu_lock_acquire(&(sp)->dep_map);
+	lock_map_acquire(&(sp)->dep_map);
 	return retval;
 }
 
@@ -249,7 +249,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
 static inline void srcu_read_unlock(struct srcu_struct *sp, int idx)
 	__releases(sp)
 {
-	rcu_lock_release(&(sp)->dep_map);
+	lock_map_release(&(sp)->dep_map);
 	__srcu_read_unlock(sp, idx);
 }
 
diff --git a/kernel/rcu/srcu.c b/kernel/rcu/srcu.c
index ef3bcfb15b39..0f9e542ca3f2 100644
--- a/kernel/rcu/srcu.c
+++ b/kernel/rcu/srcu.c
@@ -395,6 +395,9 @@ static void __synchronize_srcu(struct srcu_struct *sp, int trycount)
 			 lock_is_held(&rcu_sched_lock_map),
 			 "Illegal synchronize_srcu() in same-type SRCU (or in RCU) read-side critical section");
 
+	lock_map_acquire(&sp->dep_map);
+	lock_map_release(&sp->dep_map);
+
 	might_sleep();
 	init_completion(&rcu.completion);
 

The lock_map_acquire() in srcu_read_lock() is really not desired
though, since it will make recursion get flagged as bad. If I change
that to lock_map_acquire_read() though, the problem doesn't get flagged
for some reason. I thought it should.

Regardless though, I don't see a way to solve this problem for debugfs.
We have a ton of debugfs files in net/mac80211/debugfs.c that need to
acquire e.g. the RTNL (or other locks), and I'm not sure we can easily
avoid removing the debugfs files under the RTNL, since we get all our
configuration callbacks with the RTNL already held...

Need to think about that, but perhaps there's some other solution?

johannes