On 5/15/21 4:44 AM, Brian Norris wrote:
It would seem like _anyone_ that calls cfg80211_unregister_wdev() with an interface up will hit this -- not unique to mwifiex. In fact, apart from the fact that all his line numbers are wrong, Maximilian's original email points out exactly where the deadlock is. cfg80211_unregister_wdev() holds the wiphy lock, and the GOING_DOWN notification also tries to grab it.

It does happen that in many other paths, you've already ensured that you bring the interface down, so e.g., mac80211 drivers don't tend to hit this. But I wouldn't be surprised if a few other cfg80211 drivers hit this too.

The best solution I could figure was to do a similar lock dance done in nl80211_del_interface() -- close the netdev without holding the wiphy lock. I'll send out a patch shortly.
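
For illustration only, the deadlock and the "lock dance" described above can be modeled in user space. This is a rough sketch, not kernel code and not the actual patch: every `fake_*` and `remove_iface_*` name is hypothetical, a pthread mutex stands in for the wiphy lock, and the model assumes (as the message above describes) that unregistering an interface that is still up triggers a GOING_DOWN notifier which also wants that lock. An error-checking mutex is used so that the self-deadlock shows up as EDEADLK instead of a hang.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a wiphy with one interface. */
struct fake_wiphy {
	pthread_mutex_t mtx;	/* plays the role of the wiphy lock */
	bool iface_up;
	bool registered;
};

static void fake_wiphy_init(struct fake_wiphy *w)
{
	pthread_mutexattr_t attr;
	int rc;

	rc = pthread_mutexattr_init(&attr);
	assert(rc == 0);
	/* ERRORCHECK: relocking from the same thread returns EDEADLK. */
	rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	assert(rc == 0);
	rc = pthread_mutex_init(&w->mtx, &attr);
	assert(rc == 0);
	pthread_mutexattr_destroy(&attr);

	w->iface_up = true;
	w->registered = true;
}

/* Models the GOING_DOWN notifier: it needs the wiphy lock itself. */
static int fake_going_down_notifier(struct fake_wiphy *w)
{
	int rc = pthread_mutex_lock(&w->mtx);

	if (rc)
		return rc;	/* EDEADLK if the caller already holds it */
	pthread_mutex_unlock(&w->mtx);
	return 0;
}

/* Models dev_close(): a no-op if already down, else runs the notifier. */
static int fake_dev_close(struct fake_wiphy *w)
{
	int rc;

	if (!w->iface_up)
		return 0;
	rc = fake_going_down_notifier(w);
	if (rc)
		return rc;
	w->iface_up = false;
	return 0;
}

/* Models unregistration with the lock held; closes an iface still up. */
static int fake_unregister_wdev_locked(struct fake_wiphy *w)
{
	int rc = fake_dev_close(w);

	if (rc)
		return rc;
	w->registered = false;
	return 0;
}

/* Problematic pattern: unregister a running interface under the lock. */
static int remove_iface_naive(struct fake_wiphy *w)
{
	int rc;

	pthread_mutex_lock(&w->mtx);
	rc = fake_unregister_wdev_locked(w);
	pthread_mutex_unlock(&w->mtx);
	return rc;
}

/* Lock dance: close first without the lock, then lock and unregister. */
static int remove_iface_lock_dance(struct fake_wiphy *w)
{
	int rc = fake_dev_close(w);

	if (rc)
		return rc;
	pthread_mutex_lock(&w->mtx);
	rc = fake_unregister_wdev_locked(w);
	pthread_mutex_unlock(&w->mtx);
	return rc;
}
```

In this model, calling remove_iface_naive() on a freshly initialized fake_wiphy reports EDEADLK and leaves the interface up and registered, while remove_iface_lock_dance() succeeds because the notifier has already run by the time the lock is taken. The real kernel mutex would simply hang in the first case.
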

I believe that if we're going to fix that in the individual drivers, there should be at least some sort of warning/documentation on cfg80211_unregister_wdev().

Also someone might want to look at other WiFi drivers calling cfg80211_unregister_wdev(). For example, I can see a locked call in the brcm80211 driver, but no previous dev_close() call (see [1]). Haven't looked in detail though, so I might just be wrong.

I can't help but think that this should maybe be addressed in that common part instead. I know too little of that subsystem to tell if that might be infeasible though.

Regards,
Max

[1]: https://elixir.bootlin.com/linux/v5.13-rc1/source/drivers/net/wireless/broadcom/brcm80211/brcmfmac/p2p.c#L2445