On Fri, 2022-03-25 at 22:16 +0100, Johannes Berg wrote: > > > Thread 1 Thread 2 > > nl80211_pre_doit(): > > rtnl_lock() > > wiphy_lock() nl80211_pre_doit(): > > rtnl_lock() // blocked by Thread 1 > > nl80211_vendor_cmd(): > > doit() > > cfg80211_unregister_netdevice() > > rtnl_unlock(): > > netdev_run_todo(): > > __rtnl_unlock() > > <got RTNL lock> > > wiphy_lock() // blocked by Thread 1 > > rtnl_lock(); // DEADLOCK > > nl80211_post_doit(): > > wiphy_unlock(); > > > Right, this is what I had discussed in my other mails. > > Basically, you're actually doing (some form of) unregister_netdevice() > before rtnl_unlock(). > > Clearly this isn't possible in cfg80211 itself. > > However, I couldn't entirely discount the possibility that this is > possible: > > Thread 1 Thread 2 > rtnl_lock() > unregister_netdevice() > __rtnl_unlock() > rtnl_lock() > wiphy_lock() > netdev_run_todo() > __rtnl_unlock() > // list not empty now > // because of thread 2 rtnl_lock() > rtnl_lock() > wiphy_lock() > > ** DEADLOCK ** > > > Given my other discussion with Jakub though, it seems that we can indeed > make sure that this cannot happen, and then this scenario is impossible > without the unregistration you're doing. > I just sent a patch for this then, forgot to CC everyone: https://lore.kernel.org/r/20220325225055.37e89a72f814.Ic73d206e217db20fd22dcec14fe5442ca732804b@changeid But basically it changes nothing, just adds a WARN_ON with documentation ensuring that the invariant never breaks, i.e. that Thread 2 can't happen. And maybe I should've written that with 3 Threads, but the setup of unregister_netdevice()/__rtnl_unlock() could happen anywhere in the system anyway. johannes