On 02/02/2018 07:11 AM, Toke Høiland-Jørgensen wrote:
> Since we now have the convenient helper to do so, actually adjust the
> TSQ pacing shift for packets going out over a WiFi interface. This
> significantly improves throughput for locally-originated TCP
> connections. The default pacing shift of 10 corresponds to ~1ms of
> queued packet data. Adjusting this to a shift of 8 (i.e. ~4ms) improves
> 1-hop throughput for ath9k by a factor of 3, whereas increasing it more
> has diminishing returns.
>
> Achieved throughput for different values of sk_pacing_shift (average of
> 5 iterations of 10-sec netperf runs to a host on the other side of the
> WiFi hop):
>
> sk_pacing_shift 10:  43.21 Mbps (pre-patch)
> sk_pacing_shift  9:  78.17 Mbps
> sk_pacing_shift  8: 123.94 Mbps
> sk_pacing_shift  7: 128.31 Mbps
>
> Latency for competing flows increases from ~3 ms to ~10 ms with this
> change. This is about the same magnitude of queueing latency induced by
> flows that are not originated on the WiFi device itself (and so are not
> limited by TSQ).
>
> Signed-off-by: Toke Høiland-Jørgensen <toke@xxxxxxx>
> ---
>  net/mac80211/tx.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
> index 25904af38839..69722504e3e1 100644
> --- a/net/mac80211/tx.c
> +++ b/net/mac80211/tx.c
> @@ -3574,6 +3574,14 @@ void __ieee80211_subif_start_xmit(struct sk_buff *skb,
>  	if (!IS_ERR_OR_NULL(sta)) {
>  		struct ieee80211_fast_tx *fast_tx;
>
> +		/* We need a bit of data queued to build aggregates properly, so
> +		 * instruct the TCP stack to allow more than a single ms of data
> +		 * to be queued in the stack. The value is a bit-shift of 1
> +		 * second, so 8 is ~4ms of queued data. Only affects local TCP
> +		 * sockets.
> +		 */
> +		sk_pacing_shift_update(skb->sk, 8);
> +
>  		fast_tx = rcu_dereference(sta->fast_tx);
>
>  		if (fast_tx &&

I know that lowering the shift below 8 (i.e. queueing more data) doesn't
help much on ath9k, but I ran a test on ath10k and found that a shift of
6 or 7 is optimal there. Since ath10k/11ac devices have higher bandwidth
than ath9k/11n, could we consider using 6 or 7 to accommodate that?

sk_pacing_shift   tx (Mbps)   cpu usage (%)
       5            404           28.5
       6            398           13.8
       7            401            8
       8            378            5
       9            230            4.5
      10             79.6          2

I have a quad core machine.

$ cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 58
model name	: Intel(R) Core(TM) i5-3380M CPU @ 2.90GHz

-- 
Ryan Hsu
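
For reference, the shift-to-time arithmetic from the patch comment ("a
bit-shift of 1 second") can be sketched roughly as below. This is only an
illustration of the arithmetic, not kernel code: the helper names
pacing_shift_to_usec() and pacing_shift_to_bytes() are hypothetical, and
the sketch assumes the TSQ budget is simply the pacing rate shifted right
by sk_pacing_shift, ignoring any other clamps the stack applies.

```c
#include <stdint.h>

/* Hypothetical helper: amount of queued data, expressed as time, that a
 * given pacing shift allows. One second (in microseconds) shifted right
 * by 'shift', rounded down. */
static inline uint32_t pacing_shift_to_usec(unsigned int shift)
{
	return (uint32_t)(1000000u >> shift);
}

/* Hypothetical helper: the same budget in bytes for a given pacing rate
 * in bytes per second. Assumption: only the "rate >> shift" term is
 * modelled here; any min/max limits are left out. */
static inline uint64_t pacing_shift_to_bytes(uint64_t rate_bytes_per_sec,
					     unsigned int shift)
{
	return rate_bytes_per_sec >> shift;
}
```

Under these assumptions a shift of 10 gives roughly 1 ms of queued data,
8 gives roughly 4 ms, 7 roughly 8 ms, and 6 roughly 16 ms, which is the
range the two tables above are comparing.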