On 5/31/22 12:08 PM, Felix Fietkau wrote:
When a vif is being removed and sdata->bss is cleared, __ieee80211_wake_txqs
can still be called on it, which crashes as soon as sdata->bss is being
dereferenced.
To fix this properly, check for SDATA_STATE_RUNNING before waking queues,
and take the fq lock when setting it (to ensure that __ieee80211_wake_txqs
observes the change when running on a different CPU
I patched this into my 5.17+ kernel, and in a test that brings up 16 virtual
station vdevs on an mtk7915, 4 of them on each of two radios will not associate.
They get 4-way timeouts.
So, I think there must be something wrong with assumptions in this patch, or
maybe it depends on some other patch I am missing. I'll remove it from my tree...
Thanks,
Ben
Signed-off-by: Felix Fietkau <nbd@xxxxxxxx>
---
net/mac80211/iface.c | 2 ++
net/mac80211/util.c | 3 +++
2 files changed, 5 insertions(+)
diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index 41531478437c..15a73b7fdd75 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -377,7 +377,9 @@ static void ieee80211_do_stop(struct ieee80211_sub_if_data *sdata, bool going_do
bool cancel_scan;
struct cfg80211_nan_func *func;
+ spin_lock_bh(&local->fq.lock);
clear_bit(SDATA_STATE_RUNNING, &sdata->state);
+ spin_unlock_bh(&local->fq.lock);
cancel_scan = rcu_access_pointer(local->scan_sdata) == sdata;
if (cancel_scan)
diff --git a/net/mac80211/util.c b/net/mac80211/util.c
index 1e26b5235add..dad42d42aa84 100644
--- a/net/mac80211/util.c
+++ b/net/mac80211/util.c
@@ -301,6 +301,9 @@ static void __ieee80211_wake_txqs(struct ieee80211_sub_if_data *sdata, int ac)
local_bh_disable();
spin_lock(&fq->lock);
+ if (!test_bit(SDATA_STATE_RUNNING, &sdata->state))
+ goto out;
+
if (sdata->vif.type == NL80211_IFTYPE_AP)
ps = &sdata->bss->ps;
--
Ben Greear <greearb@xxxxxxxxxxxxxxx>
Candela Technologies Inc http://www.candelatech.com