On 9/16/22 9:38 AM, Alexander Wetzel wrote:
On 15.09.22 22:39, Ben Greear wrote:
From reading the original patch description, it was to stop an NPE when AP was stopped. I have been testing
this patch below and it fixes the problems I saw with multiple vdevs. I was worried that
the code in the 'list_for_each_entry_rcu(sta, &local->sta_list, list) {' might still need to run
to keep everything in sync (and my patch allows that to happen), but I do not know if that is true or not.
I'm not sure if it's still save to wake the queues for the vif we are tearing down and assumed the intend was to skip those, too.
But it looks like all stations for the vif are deleted prior to setting sdata->bss = NULL, so the outcome should be the same.
Your solution removes potentially set IEEE80211_TXQ_STOP_NETIF_TX flags, reducing the risk that a queue restart during vif setup ends up with inoperable queues.
But the only way to set IEEE80211_TXQ_STOP_NETIF_TX seems to be during ieee80211_tx_dequeue(). Which should not be possible to be called as long as
SDATA_STATE_RUNNING was never set for the vif.
But I'm on thin ice here:-)
I spent two days debugging this and have only barest understanding of how all of the atf/fq/txq logic
is supposed to work.
But, I think my test case was a bit different from yours, and in my test case, before my patch,
it failed 100% of the time:
create two station vdevs on same (mtk7916) radio
admin one up (this works)
admin second one up (this fails, tx path is hung, because sdata->vif.txqs_stopped[ac] was true).
In general, I'd like to keep start/stop state as in-sync everywhere as possible, and I think my patch
might be better at that than yours since it goes through the sta list (and, maybe too, I'm completely
wrong about that).
Potentially, static void __ieee80211_wake_txqs(struct ieee80211_sub_if_data *sdata, int ac)
should just be called each time a vdev goes admin up.
But since my original test case failed so reliably, the __ieee80211_wake_txqs method must not actually be
called when secondary vdevs go admin up. I am not sure if that is per design or some other
bug.
Hoping the 2-3 people who understand this logic well will chime in :)
Thanks,
Ben
--
Ben Greear <greearb@xxxxxxxxxxxxxxx>
Candela Technologies Inc http://www.candelatech.com