Please disregard this v3. The spinlock is misplaced in ht.c. Sorry. -----Message d'origine----- De : Jean-Pierre TOSONI <jp.tosoni@xxxxxxxxx> Envoyé : jeudi 28 octobre 2021 11:15 À : Johannes Berg (johannes@xxxxxxxxxxxxxxxx) <johannes@xxxxxxxxxxxxxxxx>; 'linux-wireless@xxxxxxxxxxxxxxx' <linux-wireless@xxxxxxxxxxxxxxx> Objet : [RFC v3] mac80211: fix rx blockack session race condition When the mac80211 layer is used with ath10k, the following may happen: a) radio card firmware receives ADDBA-REQ from peer b) radio sends back ADDBA-RESP to peer and signals the ath10k driver c) ath10k calls back ieee80211_manage_rx_ba_offl() in mac80211 to signal rx ba offloading d) mac80211::agg-rx.c::ieee80211_manage_rx_ba_offl() d1) sets a flag: sta->ampdu_mlme.tid_rx_manage_offl d2) queues a call to ht.c::ieee80211_ba_session_work() e) ...scheduler runs... f) ht.c::ieee80211_ba_session_work() checks the flag, clears it and sets up the rx ba session. During (e), a fast peer may already have sent a BAREQ which is propagated to rx.c::ieee80211_rx_h_ctrl(). Since the session is not yet established, mac80211 sends back a DELBA to the peer, which can hang the BA session. The phenomenon can be observed between two QCA988X fw 10.2.4 radios, using a loop of associate/arping from client to AP/disconnect. After a few thousand loops, arping does not get a response and a sniffer detects a DELBA action frame from the client, following an ADDBA. Fix: 1) check the offload flag in addition to the check for a valid aggregation session 2) protect the paired checks of (offload flag, valid aggregation session) with a spinlock against interference from ieee80211_ba_session_work(). Note 1: there is another dubious DELBA generation in ieee80211_rx_reorder_ampdu(), where the same kind of fix should fit, but I did not fix it since I knew no easy way to test. Note 2: this fix applies to wireless backports from 5.4-rc8. --- V2: remove debugging code leftovers, sorry for that V3: use spin_lock_bh instead of a mutex Index: b/net/mac80211/rx.c =================================================================== --- a/net/mac80211/rx.c +++ b/net/mac80211/rx.c @@ -3085,11 +3085,18 @@ ieee80211_rx_h_ctrl(struct ieee80211_rx_ tid = le16_to_cpu(bar_data.control) >> 12; + spin_lock_bh(&rx->sta->ampdu_mlme.rx_offl_lock); if (!test_bit(tid, rx->sta->ampdu_mlme.agg_session_valid) && - !test_and_set_bit(tid, rx->sta->ampdu_mlme.unexpected_agg)) + /* back_req is allowed if the fw just received addba */ + !test_bit(tid, rx->sta->ampdu_mlme.tid_rx_manage_offl) && + !test_and_set_bit(tid, rx->sta->ampdu_mlme.unexpected_agg)) { + spin_unlock_bh(&rx->sta->ampdu_mlme.rx_offl_lock); ieee80211_send_delba(rx->sdata, rx->sta->sta.addr, tid, WLAN_BACK_RECIPIENT, WLAN_REASON_QSTA_REQUIRE_SETUP); + } else { + spin_unlock_bh(&rx->sta->ampdu_mlme.rx_offl_lock); + } tid_agg_rx = rcu_dereference(rx->sta->ampdu_mlme.tid_rx[tid]); if (!tid_agg_rx) Index: b/net/mac80211/ht.c =================================================================== --- a/net/mac80211/ht.c +++ b/net/mac80211/ht.c @@ -367,11 +367,13 @@ void ieee80211_ba_session_work(struct wo IEEE80211_MAX_AMPDU_BUF_HT, false, true, NULL); + spin_lock_bh(&sta->ampdu_mlme.rx_offl_lock); if (test_and_clear_bit(tid + IEEE80211_NUM_TIDS, sta->ampdu_mlme.tid_rx_manage_offl)) ___ieee80211_stop_rx_ba_session( sta, tid, WLAN_BACK_RECIPIENT, 0, false); + spin_unlock_bh(&sta->ampdu_mlme.rx_offl_lock); spin_lock_bh(&sta->lock); Index: b/net/mac80211/sta_info.c =================================================================== --- a/net/mac80211/sta_info.c +++ b/net/mac80211/sta_info.c @@ -354,6 +354,7 @@ struct sta_info *sta_info_alloc(struct i spin_lock_init(&sta->lock); spin_lock_init(&sta->ps_lock); + spin_lock_init(&sta->ampdu_mlme.rx_offl_lock); INIT_WORK(&sta->drv_deliver_wk, sta_deliver_ps_frames); INIT_WORK(&sta->ampdu_mlme.work, ieee80211_ba_session_work); mutex_init(&sta->ampdu_mlme.mtx); Index: b/net/mac80211/sta_info.h =================================================================== --- a/net/mac80211/sta_info.h +++ b/net/mac80211/sta_info.h @@ -266,6 +266,7 @@ struct tid_ampdu_rx { * @mtx: mutex to protect all TX data (except non-NULL assignments * to tid_tx[idx], which are protected by the sta spinlock) * tid_start_tx is also protected by sta->lock. + * @rx_offl_lock: protects transfer from tid_rx_manage_offl to agg_session_valid * @tid_rx: aggregation info for Rx per TID -- RCU protected * @tid_rx_token: dialog tokens for valid aggregation sessions * @tid_rx_timer_expired: bitmap indicating on which TIDs the @@ -287,6 +288,7 @@ struct tid_ampdu_rx { struct sta_ampdu_mlme { struct mutex mtx; /* rx */ + spinlock_t rx_offl_lock; struct tid_ampdu_rx __rcu *tid_rx[IEEE80211_NUM_TIDS]; u8 tid_rx_token[IEEE80211_NUM_TIDS]; unsigned long tid_rx_timer_expired[BITS_TO_LONGS(IEEE80211_NUM_TIDS)]; -- quilt 0.63