On Mon, 2018-09-03 at 13:11 +0200, Toke Høiland-Jørgensen wrote:

> > 6 vs. 8, I think? But I didn't follow the full discussion.

Err, I just realized that I was completely wrong - the default, of
course, is 10. So smaller values mean *more* buffering [1]. Most of my
argumentation was thus utter garbage :-)

> > I also think that we shouldn't necessarily restrict this to "for
> > the ath10k". Is there any reason to think that this would be
> > different for different chips?
>
> No, I also think it should be possible to select a value that will
> work for all drivers. There's a strong diminishing returns effect
> here after a certain point, and I believe we should strongly push
> back against just adding more buffering to chase (single-flow)
> throughput numbers; and I'm not convinced that having different
> values for different drivers is worth the complexity.

I think I can see some point in it if the driver requires some
buffering for some specific reason? But you'd have to be able to state
that reason, I guess. I could imagine a firmware limitation that makes
it necessary to build two aggregates, or something around MU-MIMO,
etc.

> As far as I'm concerned, just showing an increase in maximum
> throughput under ideal conditions (as this initial patch submission
> did) is not sufficient argument for adding more buffering. At a
> minimum, the evaluation should also include:
>
> - Impact on latency and throughput for competing flows in the same
>   test.
> - Impact on latency for the TCP flow itself.
> - A full evaluation at low rates and in scenarios with more than one
>   client.

That seems reasonable.

> As far as the actual value, I *think* it may be that the default
> shift should be 7 (8 ms) rather than 8 (4 ms) as it currently is.
> Going back and looking at my data from when I submitted the original
> patch, it looks like the point of diminishing returns is somewhere
> between those two with ath9k (where I did most of my testing), and
> it seems reasonable that it could be slightly higher (i.e., more
> buffering) for ath10k.

Grant's data shows a significant difference between a shift of 6 and 7
for both latency and throughput:

 * median tpt
   - ~241 vs. ~201 (both 1 and 5 streams)
 * median latency
   - 7.5 vs. 6 ms (1 stream)
   - 17.3 vs. 16.6 ms (5 streams)

A 20% throughput improvement at a <= 1.5 ms latency cost seems like a
pretty reasonable trade-off? [2]

johannes
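
[1] To make the shift arithmetic concrete: TSQ limits the data queued
below a TCP socket to roughly sk_pacing_rate >> sk_pacing_shift bytes
(modulo a minimum floor), i.e. about 2^-shift seconds worth of data at
the current pacing rate. Here's a minimal standalone userspace sketch
(not kernel code) of that relationship; the 500 Mbit/s pacing rate is
just an assumed example figure:

/*
 * Illustration only: how sk_pacing_shift maps to the amount of data
 * TCP keeps queued below the socket.  Roughly, TSQ allows
 * pacing_rate >> shift bytes in flight, i.e. ~2^-shift seconds of
 * data.  The 500 Mbit/s rate is an arbitrary example.
 */
#include <stdio.h>

int main(void)
{
	const unsigned long long pacing_rate = 500000000ULL / 8; /* bytes/s */
	const int shifts[] = { 10, 8, 7, 6 };

	for (size_t i = 0; i < sizeof(shifts) / sizeof(shifts[0]); i++) {
		int shift = shifts[i];
		unsigned long long bytes = pacing_rate >> shift;
		double ms = 1000.0 / (double)(1ULL << shift);

		printf("shift %2d -> ~%5.2f ms of buffering (~%llu bytes at 500 Mbit/s)\n",
		       shift, ms, bytes);
	}
	return 0;
}

This prints ~1 ms for the current core default of 10, ~4 ms for 8,
~8 ms for 7 and ~16 ms for 6, which is where the "(8 ms)" and "(4 ms)"
figures above come from.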
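
[2] Working out the trade-off from the numbers above (latency in ms;
throughput units as in Grant's report):

	throughput: (241 - 201) / 201 ≈ 0.199, i.e. ~20% more
	latency:     7.5 -  6.0 = 1.5 ms  (1 stream)
	            17.3 - 16.6 = 0.7 ms  (5 streams)

so the 1.5 ms single-stream number is the worst case.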