On Wed, 2016-11-23 at 15:59 +0800, Chris Chiu wrote: > ieee80211_iface_work() will check if sw scanning is in progress > before handling block ack session. In our case, the RTL8821AE > operate in station mode, when tx session expired, DELBA packet > stuck during sw scanning and so do other data packets. > > ieee80211_scan_state_decision() will take lots of time in > SCAN_SUSPEND/SCAN_RESUME state due to !tx_empty or bad_latency. > Then the sw scanning mostly take > 20 seconds to finish or even > worse in our case RTL8821AE have 37 channels for 2G+5G to scan > and tx stalls ~300 seconds. AP side still thinks the connection > is alive because it still receives the QoS NULL packet from STA. > So the link state will never change but actually no data tx/rx > during this long time. > > This commit tries to send out packet in SCAN_SUSPEND state so the > sw scanning can complete more efficiently and less affect on Block > Ack session handling. Verified on RTL8821AE for > 30000 pings and > no Tx BA session stuck observed. The premise seems fairly reasonable, although I'm a little worried that if so much new traffic is coming in we never finish the scan suspend? Actually, the queues are still stopped, so it's only management frames that can come in, so that should be ok? > + test_and_clear_bit(SCAN_SUSPEND_SCANNING, &local->scanning); > That makes no sense, you're not checking the return value, just clear the bit without test. > @@ -844,6 +846,8 @@ static void ieee80211_scan_state_suspend(struct > ieee80211_local *local, > /* disable PS */ > ieee80211_offchannel_return(local); > > + __set_bit(SCAN_SUSPEND_SCANNING, &local->scanning); Why are you using the non-atomic version here, vs. the atomic one above? johannes