Re: [RFT] ath9k: multi-rate-retry fails at HW level

On 01.12.20 14:33, Toke Høiland-Jørgensen wrote:
> Zefir Kurtisi <zefir.kurtisi@xxxxxxxxxxxx> writes:
> 
>> CC += adrian
>>
>> On 24.11.20 15:45, Toke Høiland-Jørgensen wrote:
>>> Zefir Kurtisi <zefku@xxxxxxxxxxxx> writes:
>>>
>>>> Hi,
>>>>
>>>> I am running into a strange issue with the ath9k operating a 9590
>>>> device which to me seems like a HW issue, but since work on rate
>>>> controllers has been going on for decades, I can hardly imagine this
>>>> has never shown up before.
>>>>
>>>> The issue observed is this: the TX status descriptors never report
>>>> rateindex 1, it is always 0, 2, or 3, but never 1.
>>>>
>>>> I noticed this by overwriting the rate configuration provided by
>>>> minstrel to a static setup, e.g. (7,3)(5,3)(3,3)(1,3), all MCS. The
>>>> device operates as iperf client to a connected AP and continuously
>>>> transmits data. While at that, the attenuation between the endpoints
>>>> is gradually increased, expecting to see a gradual shift in the
>>>> reported TX status rateindex from 0 to 3. But nada, the values
>>>> reported are 0, 2, and 3 - never 1.
>>>>
>>>> I double checked that the TX descriptors are correctly set with the
>>>> rates and retry counts - all looking sane.
>>>>
>>>> More obviously, after changing the rate configuration to
>>>> (7,3)(1,3)(5,3)(3,3) the expectation would be to have either 0 or 1
>>>> reported as rateidx, since the transmission ought to be successful
>>>> with the lowest rate or never. Again all rates are reported but 1.
>>>>
>>>> Now the question for me is: what is the HW exactly doing with such a
>>>> configuration? Is it skipping the second rate, or is it just reporting
>>>> wrong?
>>>
>>> You should be able to see this by looking at the rates the frames are
>>> being sent at, shouldn't you?
>>>
>> Yes, I did that, and the captures indicate that the second rate is simply skipped.
>>
>> Here are some use cases and their sniffing results. The setup is an 11ng STA
>> connected to an AP, with the attenuation adjusted such that MCS 7 fails while MCS 5
>> and below succeed. A monitor is sniffing while a single ping is sent from AP to STA.
>>
>> With a rate configuration of (7/2)(3/2)(1/2) we get:
>> 14:02:42.923880 9481489761us tsft 2412 MHz 11n -68dBm signal 65.0 Mb/s MCS 7 20
>> MHz long GI RX-STBC0 -68dBm signal antenna 0 Data IV:  e Pad 20 KeyID 0
>> 14:02:42.923909 9481490037us tsft 2412 MHz 11n -69dBm signal 65.0 Mb/s MCS 7 20
>> MHz long GI RX-STBC0 -69dBm signal antenna 0 Data IV:  e Pad 20 KeyID 0
>> 14:02:42.925244 9481491044us tsft 2412 MHz 11n -68dBm signal 13.0 Mb/s MCS 1 20
>> MHz long GI RX-STBC0 -68dBm signal antenna 0 Data IV:  e Pad 20 KeyID 0
>>
>>
>> with (7/2)(1/2)(3/2):
>> 13:59:37.073147 9295637087us tsft 2412 MHz 11n -69dBm signal 65.0 Mb/s MCS 7 20
>> MHz long GI RX-STBC0 -69dBm signal antenna 0 Data IV:  c Pad 20 KeyID 0
>> 13:59:37.073467 9295637438us tsft 2412 MHz 11n -69dBm signal 65.0 Mb/s MCS 7 20
>> MHz long GI RX-STBC0 -69dBm signal antenna 0 Data IV:  c Pad 20 KeyID 0
>> 13:59:37.074591 9295638498us tsft 2412 MHz 11n -68dBm signal 26.0 Mb/s MCS 3 20
>> MHz long GI RX-STBC0 -68dBm signal antenna 0 Data IV:  c Pad 20 KeyID 0
>>
>> and with (7/2)(3/2):
>> 14:04:27.269806 9585836783us tsft 2412 MHz 11n -69dBm signal 65.0 Mb/s MCS 7 20
>> MHz long GI RX-STBC0 -69dBm signal antenna 0 Data IV: 10 Pad 20 KeyID 0
>> 14:04:27.270342 9585837344us tsft 2412 MHz 11n -68dBm signal 65.0 Mb/s MCS 7 20
>> MHz long GI RX-STBC0 -68dBm signal antenna 0 Data IV: 10 Pad 20 KeyID 0
>> 14:04:27.271368 9585838370us tsft 2412 MHz 11n -68dBm signal 65.0 Mb/s MCS 7 20
>> MHz long GI RX-STBC0 -68dBm signal antenna 0 Data IV: 10 Pad 20 KeyID 0
>> [..]
>>
>> a total of 14 attempts at MCS 7 with the ping finally failing.
>>
>>>> Both possibilities have great impact, since upper layers (like
>>>> airtime) use the returned rateidx to calculate and configure operating
>>>> parameters at runtime.
>>>
>>> Have you actually observed any issues from this? If it's just skipping a
>>> rate, minstrel should still be able to make decisions based on the
>>> actual values returned, no?
>>>
>> The issues arise from the fact that the TX status reports only a
>> (tx-rateindex/tx-attempt-index) pair per TX descriptor, leaving the driver to
>> calculate what was put on air based on these two values. If one had rates set to
>> (7/2)(3/7)(1/2) and the TX status reports (tx-rateindex=2/tx-attempt-index=0), the
>> driver assumes there were 10 attempts in total, while in fact there were 3 when the
>> second rate is skipped. What direct effect this has on RC I can't grasp, but it
>> definitely falsifies statistics.
>>
>> Same goes for airtime: check how this falsifies its calculation in
>> ath_tx_count_airtime().
> 
> Ah, right, I was assuming that rates[1].count would be reset to zero
> somehow. Have you confirmed that the attempts actually go up in the
> Minstrel stats for the skipped rate?
> 
>> Also, the case mentioned above is an immediately visible issue: if RC
>> provides two rates, e.g. (7/3)(5/3), of which the first is too high and
>> the second is not even attempted, frames don't make it through.
> 
> Yeah, rate control would likely take longer to converge to the right
> rate. I suppose if this is a hardware model-specific issue that a quirks
> bit could be added to instruct Minstrel to disregard the second index.
> But it does sound a bit odd; have you verified that it's consistent on
> different units of the same model (and not just a busted device)?
> 

False alarm.

We got confirmation that the observed failure with that exact chip revision is not
happening on a different platform. It still might be a HW issue specific to our
rarely used PPC platform, but it is not an ath9k malfunction. I'll dig further
into that and report back if it is relevant for the list.

Thanks Toke for the feedback and insights, and sorry for the noise.


Cheers,
Zefir
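
For readers following the attempt accounting discussed in the thread, here is a
minimal sketch of the arithmetic behind the (7/2)(3/7)(1/2) example with a TX
status of (tx-rateindex=2/tx-attempt-index=0). It is not ath9k or mac80211 code:
the function names and the `skipped` parameter are hypothetical and exist only to
illustrate how the inferred attempt count diverges from what was put on air if
one series in the chain is silently skipped. It assumes, as the thread's numbers
imply, that an attempt index of 0 means one try at the final rate.

```python
from typing import Iterable, List, Tuple

# Each entry is (MCS index, configured tries), written "(7/2)" in the thread.
RateChain = List[Tuple[int, int]]


def assumed_attempts(chain: RateChain, rateindex: int, attempt_index: int) -> int:
    """Attempts inferred from the TX status pair alone (hypothetical helper).

    Every series before `rateindex` is assumed to have used all of its
    configured tries; the final series used `attempt_index + 1` tries.
    """
    earlier = sum(tries for _mcs, tries in chain[:rateindex])
    return earlier + attempt_index + 1


def on_air_attempts(
    chain: RateChain,
    rateindex: int,
    attempt_index: int,
    skipped: Iterable[int] = (1,),
) -> int:
    """Attempts actually transmitted if the series in `skipped` never ran.

    `skipped` is an assumption modelling the behaviour observed in the
    thread, where the second series (index 1) was never seen on air.
    """
    skipped_set = set(skipped)
    earlier = sum(
        tries
        for idx, (_mcs, tries) in enumerate(chain[:rateindex])
        if idx not in skipped_set
    )
    return earlier + attempt_index + 1


chain = [(7, 2), (3, 7), (1, 2)]
print(assumed_attempts(chain, rateindex=2, attempt_index=0))  # 10
print(on_air_attempts(chain, rateindex=2, attempt_index=0))   # 3
```

The gap between the two results (10 versus 3 here) is the over-count that would
feed into rate control statistics and airtime accounting in the scenario
described above.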



