On 29-11-2016 13:59, Johannes Berg wrote: > On Wed, 2016-11-23 at 15:59 +0800, Chris Chiu wrote: >> ieee80211_iface_work() will check if sw scanning is in progress >> before handling block ack session. In our case, the RTL8821AE >> operate in station mode, when tx session expired, DELBA packet >> stuck during sw scanning and so do other data packets. >> >> ieee80211_scan_state_decision() will take lots of time in >> SCAN_SUSPEND/SCAN_RESUME state due to !tx_empty or bad_latency. >> Then the sw scanning mostly take > 20 seconds to finish or even >> worse in our case RTL8821AE have 37 channels for 2G+5G to scan >> and tx stalls ~300 seconds. AP side still thinks the connection >> is alive because it still receives the QoS NULL packet from STA. >> So the link state will never change but actually no data tx/rx >> during this long time. >> >> This commit tries to send out packet in SCAN_SUSPEND state so the >> sw scanning can complete more efficiently and less affect on Block >> Ack session handling. Verified on RTL8821AE for > 30000 pings and >> no Tx BA session stuck observed. > > The premise seems fairly reasonable, although I'm a little worried that > if so much new traffic is coming in we never finish the scan suspend? > Actually, the queues are still stopped, so it's only management frames > that can come in, so that should be ok? So are pings a good way to verify block-ack session handling. How often was a scan issued within those 30000 pings or was that left to wpa_supplicant/network-manager/whatever? Regards, Arend >> + test_and_clear_bit(SCAN_SUSPEND_SCANNING, &local->scanning); >> > > That makes no sense, you're not checking the return value, just clear > the bit without test. > >> @@ -844,6 +846,8 @@ static void ieee80211_scan_state_suspend(struct >> ieee80211_local *local, >> /* disable PS */ >> ieee80211_offchannel_return(local); >> >> + __set_bit(SCAN_SUSPEND_SCANNING, &local->scanning); > > Why are you using the non-atomic version here, vs. the atomic one > above? > > johannes >