Search Linux Wireless

Re: [PATCH] mac80211: fix Tx BA session stuck issue during sw scanning

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Nov 29, 2016 at 8:59 PM, Johannes Berg
<johannes@xxxxxxxxxxxxxxxx> wrote:
> On Wed, 2016-11-23 at 15:59 +0800, Chris Chiu wrote:
>> ieee80211_iface_work() will check if sw scanning is in progress
>> before handling block ack session. In our case, the RTL8821AE
>> operate in station mode, when tx session expired, DELBA packet
>> stuck during sw scanning and so do other data packets.
>>
>> ieee80211_scan_state_decision() will take lots of time in
>> SCAN_SUSPEND/SCAN_RESUME state due to !tx_empty or bad_latency.
>> Then the sw scanning mostly take > 20 seconds to finish or even
>> worse in our case RTL8821AE have 37 channels for 2G+5G to scan
>> and tx stalls ~300 seconds. AP side still thinks the connection
>> is alive because it still receives the QoS NULL packet from STA.
>> So the link state will never change but actually no data tx/rx
>> during this long time.
>>
>> This commit tries to send out packet in SCAN_SUSPEND state so the
>> sw scanning can complete more efficiently and less affect on Block
>> Ack session handling. Verified on RTL8821AE for > 30000 pings and
>> no Tx BA session stuck observed.
>
> The premise seems fairly reasonable, although I'm a little worried that
> if so much new traffic is coming in we never finish the scan suspend?
> Actually, the queues are still stopped, so it's only management frames
> that can come in, so that should be ok?
>

Actually it will finish scan eventually and back to SCAN_DECISION
state but almost 20~30 seconds elapsed. The local->scanning should be
cleared after all channels been scanned. However, from the debug
messages I added in ieee80211_iface_work(), it still returns when
check local->scanning and the DELBA still has no chance to be
transferred then stuck again at the next scan state machine. Supposed
to be another scan request issued but I don't know who's the killer.
Except to find who issue the next scan request, BA session have no
chance to reset in this long scan period (>20s) still need to be taken
care.

>> +     test_and_clear_bit(SCAN_SUSPEND_SCANNING, &local->scanning);
>>
>
> That makes no sense, you're not checking the return value, just clear
> the bit without test.
>
>> @@ -844,6 +846,8 @@ static void ieee80211_scan_state_suspend(struct
>> ieee80211_local *local,
>>       /* disable PS */
>>       ieee80211_offchannel_return(local);
>>
>> +     __set_bit(SCAN_SUSPEND_SCANNING, &local->scanning);
>
> Why are you using the non-atomic version here, vs. the atomic one
> above?
>
You're right. I just want to clear_bit and set_bit in this case, sorry
for that confusing. Or any better suggestion?

> johannes



[Index of Archives]     [Linux Host AP]     [ATH6KL]     [Linux Wireless Personal Area Network]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Linux Kernel]     [IDE]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite Hiking]     [MIPS Linux]     [ARM Linux]     [Linux RAID]

  Powered by Linux