Re: [PATCH v2 2/2] ath10k: Set sk_pacing_shift to 6 for 11AC WiFi chips

Toke Høiland-Jørgensen <toke@xxxxxxx> · Mon, 03 Sep 2018 15:35:11 +0200

Johannes Berg <johannes@xxxxxxxxxxxxxxxx> writes:

> On Mon, 2018-09-03 at 13:11 +0200, Toke Høiland-Jørgensen wrote:
>
>> > 6 vs. 8, I think? But I didn't follow the full discussion.
>
> Err, I just realized that I was completely wrong - the default, of
> course, is 10. So smaller values mean more buffering.
>
> Most of my argumentation was thus utter garbage :-)

Well, I got the gist, even if the sign bit was wrong ;)
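
For anyone following along, here is a rough sketch of how the shift maps
to buffering time. It assumes the queue limit is approximately
sk_pacing_rate >> sk_pacing_shift bytes, with the pacing rate in bytes
per second, so the budget works out to about 1/2^shift seconds of data.
The helper name buffering_ms is made up for illustration and is not
kernel code.

```python
def buffering_ms(pacing_shift):
    """Approximate buffering budget in milliseconds for a given shift.

    Assumes limit_bytes ~= pacing_rate >> pacing_shift, i.e. roughly
    1 / 2**pacing_shift seconds worth of data at the pacing rate.
    """
    return 1000.0 / (1 << pacing_shift)


# Smaller shift means a larger budget, i.e. more buffering.
for shift in (10, 8, 7, 6):
    print(f"shift {shift:2d}: ~{buffering_ms(shift):.1f} ms")
```

The "4 ms", "8 ms" and "16 ms" figures used in this thread are the
rounded values for shifts 8, 7 and 6.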

>> > I also think that we shouldn't necessarily restrict this to "for the
>> > ath10k". Is there any reason to think that this would be different for
>> > different chips?
>> 
>> No, I also think it should be possible to select a value that will work
>> for all drivers. There's a strong diminishing returns effect here after
>> a certain point, and I believe we should strongly push back against just
>> adding more buffering to chase (single-flow) throughput numbers; and I'm
>> not convinced that having different values for different drivers is
>> worth the complexity.
>
> I think I can see some point in it if the driver requires some
> buffering for some specific reason? But you'd have to be able to state
> that reason, I guess, I could imagine having a firmware limitation to
> need to build two aggregates, or something around MU-MIMO, etc.

Right, I'm not ruling out that there can be legitimate reasons to add
extra buffering; but a lot of times it's just used to paper over other
issues, so a good explanation is definitely needed...

>> As far as the actual value, I *think* it may be that the default shift
>> should be 7 (8 ms) rather than 8 (4 ms) as it currently is. Going back
>> and looking at my data from when I submitted the original patch, it
>> looks like the point of diminishing returns is somewhere between those
>> two with ath9k (where I did most of my testing), and it seems reasonable
>> that it could be slightly higher (i.e., more buffering) for ath10k.
>
> Grant's data shows a significant difference between 6 and 7 for both
> latency and throughput:
>
>  * median tpt
>    - ~241 vs ~201 (both 1 and 5 streams)
>  * median latency
>    - 7.5 vs 6 (1 stream)
>    - 17.3 vs. 16.6 (5 streams)
>
> A 20% throughput improvement at <= 1.5ms latency cost seems like a
> pretty reasonable trade-off?

Yeah, on its face. What bothers me is that this is the exact opposite
of the results I got from my ath10k tests (there, throughput *dropped*
and latency doubled when going from 4 to 16 ms of
buffering). And, well, Grant's data is from a single test in a noisy
environment where the timeseries graph shows that throughput is all over
the place for the duration of the test; so it's hard to draw solid
conclusions from (for instance, for the 5-stream test, the average
throughput for 6 is 331 and 379 Mbps for the two repetitions, and for 7
it's 326 and 371 Mbps). Unfortunately I don't have the same hardware
used in this test, so I can't go verify it myself; so the only thing I
can do is grumble about it here... :)
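
To put rough numbers on the noise concern, here is a small sketch that
uses only the figures quoted above. Pairing the first repetition of
shift 6 with the first repetition of shift 7 (and likewise the second)
is an assumption on my part, and rel_diff_pct is just an illustrative
helper.

```python
def rel_diff_pct(a, b):
    """Relative difference of a over b, in percent."""
    return (a - b) / b * 100.0


# Median throughput, shift 6 vs shift 7 (figures quoted above).
median_gain = rel_diff_pct(241, 201)

# 5-stream average throughput in Mbps for the two repetitions.
avg_shift6 = (331, 379)
avg_shift7 = (326, 371)

# Difference between the settings, repetition by repetition (assumed pairing).
between_settings = [rel_diff_pct(a, b) for a, b in zip(avg_shift6, avg_shift7)]

# Difference between the two repetitions of the same setting.
between_runs = [
    rel_diff_pct(max(avg_shift6), min(avg_shift6)),
    rel_diff_pct(max(avg_shift7), min(avg_shift7)),
]

print(f"median gain, 6 vs 7: {median_gain:.1f}%")
print("avg gain per repetition:", [f"{x:.1f}%" for x in between_settings])
print("spread between repetitions:", [f"{x:.1f}%" for x in between_runs])
```

On the averages, the spread between repetitions of the same setting is
several times larger than the difference between the two settings, which
is why I'm hesitant to read much into a single run.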

-Toke