Johannes Berg <johannes@xxxxxxxxxxxxxxxx> writes:

> On Mon, 2018-09-03 at 13:11 +0200, Toke Høiland-Jørgensen wrote:
>
>> > 6 vs. 8, I think? But I didn't follow the full discussion.
>
> Err, I just realized that I was completely wrong - the default, of
> course, is 10. So smaller values mean more buffering.
>
> Most of my argumentation was thus utter garbage :-)

Well, I got the gist, even if the sign bit was wrong ;)

>> > I also think that we shouldn't necessarily restrict this to "for the
>> > ath10k". Is there any reason to think that this would be different for
>> > different chips?
>>
>> No, I also think it should be possible to select a value that will work
>> for all drivers. There's a strong diminishing returns effect here after
>> a certain point, and I believe we should strongly push back against just
>> adding more buffering to chase (single-flow) throughput numbers; and I'm
>> not convinced that having different values for different drivers is
>> worth the complexity.
>
> I think I can see some point in it if the driver requires some
> buffering for some specific reason? But you'd have to be able to state
> that reason, I guess, I could imagine having a firmware limitation to
> need to build two aggregates, or something around MU-MIMO, etc.

Right, I'm not ruling out that there can be legitimate reasons to add
extra buffering; but a lot of times it's just used to paper over other
issues, so a good explanation is definitely needed...

>> As far as the actual value, I *think* it may be that the default shift
>> should be 7 (8 ms) rather than 8 (4 ms) as it currently is. Going back
>> and looking at my data from when I submitted the original patch, it
>> looks like the point of diminishing returns is somewhere between those
>> two with ath9k (where I did most of my testing), and it seems reasonable
>> that it could be slightly higher (i.e., more buffering) for ath10k.
>
> Grant's data shows a significant difference between 6 and 7 for both
> latency and throughput:
>
>  * median tpt
>    - ~241 vs ~201 (both 1 and 5 streams)
>  * median latency
>    - 7.5 vs 6 (1 stream)
>    - 17.3 vs. 16.6 (5 streams)
>
> A 20% throughput improvement at <= 1.5ms latency cost seems like a
> pretty reasonable trade-off?

Yeah, on its face. What I'm bothered about is that it is the exact
opposite of the results I got from my ath10k tests (there, throughput
*dropped* and latency doubled when going from 4 to 16 ms of buffering).

And, well, Grant's data is from a single test in a noisy environment
where the timeseries graph shows that throughput is all over the place
for the duration of the test; so it's hard to draw solid conclusions
from (for instance, for the 5-stream test, the average throughput for 6
is 331 and 379 Mbps for the two repetitions, and for 7 it's 326 and 371
Mbps).

Unfortunately I don't have the same hardware used in this test, so I
can't go verify it myself; so the only thing I can do is grumble about
it here... :)

-Toke
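
As a rough sanity check of the numbers in this thread, the arithmetic can be sketched as below. It assumes the shift simply divides one second's worth of data by 2^shift, which matches the values quoted above (7 -> 8 ms, 8 -> 4 ms, and 4 vs. 16 ms for 8 vs. 6). The function names are made up for illustration and are not taken from the kernel.

```python
def buffering_ms(shift: int) -> float:
    """Approximate buffering in milliseconds for a pacing shift.

    Assumption: the limit is one second of data at the current rate,
    right-shifted by `shift`, so a smaller shift means more buffering.
    """
    return 1000.0 / (1 << shift)


def relative_change_pct(new: float, old: float) -> float:
    """Change from `old` to `new`, as a percentage of `old`."""
    return (new - old) / old * 100.0


def mean(values):
    """Plain arithmetic mean."""
    return sum(values) / len(values)


# Shift values discussed in the thread, including the default of 10.
for shift in (6, 7, 8, 10):
    print(f"shift {shift:2d}: ~{buffering_ms(shift):.1f} ms of buffering")

# Median throughput quoted for shift 6 (~241) vs. shift 7 (~201).
print(f"median tpt gain, 6 vs. 7: {relative_change_pct(241, 201):.1f}%")

# 5-stream average throughput (Mbps) for the two repetitions of each setting.
reps = {6: (331, 379), 7: (326, 371)}
for shift, runs in reps.items():
    spread = max(runs) - min(runs)
    print(f"shift {shift}: mean {mean(runs):.1f} Mbps, run-to-run spread {spread} Mbps")
```

With these figures the run-to-run spread within one setting (45 to 48 Mbps) is several times larger than the difference between the two means, which is why a single noisy test is hard to draw conclusions from.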