Ben Greear <greearb@xxxxxxxxxxxxxxx> writes: > On 02/01/2018 02:47 PM, Johannes Berg wrote: >> On Thu, 2018-02-01 at 23:40 +0100, Johannes Berg wrote: >>> >>> The code does a plain rcu_dereference(), no locking other than >>> rcu_read_lock() involved. >>> >>> On second thought though, I'm not convinced that your modifications >>> caused the problem. >>> >>> Given your call stack, we'd expect rcu_read_lock() somewhere around >>> ath_tid_dequeue (or its caller(s)), since ieee80211_tx_dequeue clearly >>> requires it. >>> >>> Normally, ieee80211_tx_dequeue() is called from various places that >>> probably come from mac80211 and already hold the rcu_read_lock(), e.g. >>> the wake_tx_queue op. >>> >>> In this case, you're coming from drv_sta_state, so not sure why the >>> driver thinks it's OK to call the dequeue there. >> >> Just to clarify - it could just be that in the "normal" case, when a >> station dies, there's nothing on the queues - so the dequeue just >> doesn't do anything and never goes into the code path where the >> rcu_dereference() is, hence it might be a bug in mainline that just >> never triggers in ordinary scenarios. > > It looks like the code in ath9k has not been changed in that area for > some time. Hmm, I think the issue here is that after the switch to mac80211 txqs, the driver is now draining mac80211 queues, whereas before it was only draining its own driver-internal queues (which are protected by the ath_txq_lock() that ath_tx_node_cleanup() does grab). And when the switch to the mac80211 queues happened, a call to rcu_read_lock() should have been added. But, well, I had no idea this was needed until just now, so obviously I didn't add that... :) > Would adding rcu_read_lock in drv_sta_state() be a possibility? Think it should probably just be added in ath_tx_node_cleanup()? Can send a patch... -Toke