On 12 January 2017 at 16:46, Johannes Berg <johannes@xxxxxxxxxxxxxxxx> wrote: > On Thu, 2017-01-12 at 16:14 +0100, Michal Kazior wrote: >> Station pointers are RCU protected so driver must >> be extra careful if it tries to store them >> internally for later use outside of the RCU >> section it obtained it in. >> >> It was possible for station teardown to race with >> some htt events. The possible outcome could be a >> use-after-free and a crash. >> >> Only peer-flow-control capable firmware was >> affected (so hardware-wise qca99x0 and qca4019). >> >> This could be done in sta_state() itself via >> explicit synchronize_net() call but there's >> already a convenient sta_pre_rcu_remove() op that >> can be hooked up to avoid extra rcu stall. > > I don't think this makes sense. You're not using RCU-protected pointers > to the stations yourself, all accesses to them are locked under the > &ar->data_lock. As a consequence, you can't have any need for waiting > for a grace period. Since you also remove the pointer (under lock) when > the station gets removed, I don't think RCU can be the problem? Unless you then continue to use that sta pointer after you release data_lock. Consider this: > CPU0 CPU1 > 1 synchronize_net() > 2 drv_sta_state() > 3 htt_fetch_ind(pid,tid) called > 4 rcu_read_lock() > 5 get(data_lock) > 6 txq=peers[pid]->sta->txq[tid] > 7 put(data_lock) > 8 get(data_lock) > 9 peer->sta=0 > 10 put(data_lock) > 11 kfree(sta) > 12 ieee80211_tx_dequeue(txq) Even though there's no code like (9) per se you can think of it as anything that tries to "remove" the peer--sta association (ath10k_peer is removed implicitly via wmi peer delete command and waiting for htt event completion). Holding data_lock for the entire duration of handling fetch indication isn't really good for performance so it's better to fix RCU handling. Michał