On 13 January 2017 at 08:24, Johannes Berg <johannes@xxxxxxxxxxxxxxxx> wrote: > >> Unless you then continue to use that sta pointer after you release >> data_lock. > > Ouch, ok. That's rather strangely hidden though. > >> Consider this: >> >> > CPU0 CPU1 >> > 1 synchronize_net() >> > 2 drv_sta_state() >> > 3 htt_fetch_ind(pid,tid) called >> > 4 rcu_read_lock() >> > 5 get(data_lock) >> > 6 txq=peers[pid]->sta->txq[tid] >> > 7 put(data_lock) >> > 8 get(data_lock) >> > 9 peer->sta=0 >> > 10 put(data_lock) >> > 11 kfree(sta) >> > 12 ieee80211_tx_dequeue(txq) >> >> Even though there's no code like (9) per se you can think of it as >> anything that tries to "remove" the peer--sta association >> (ath10k_peer >> is removed implicitly via wmi peer delete command and waiting for htt >> event completion). >> >> Holding data_lock for the entire duration of handling fetch >> indication isn't really good for performance so it's better to fix >> RCU handling. > > Yeah, I see it now - also the comment where this happens. You probably > should mark some things __rcu though and actually use RCU primitives, > so the code is actually understandable :) Good point. I'll do that in another patch. Michał