On Sun, Nov 24, 2024 at 3:21 AM Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
>
> Recent kernels cause a lot of TCP retransmissions
>
> [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> [  5]   0.00-1.00   sec  2.24 GBytes  19.2 Gbits/sec  2767    442 KBytes
> [  5]   1.00-2.00   sec  2.23 GBytes  19.1 Gbits/sec  2312    350 KBytes
>                                                       ^^^^
>
> Replacing the qdisc with pfifo makes retransmissions go away.
>
> It appears that a flow may have a delayed packet with a very near
> Tx time. Later, we may get busy processing Rx and the target Tx time
> will pass, but we won't service Tx since the CPU is busy with Rx.
> If Rx sees an ACK and we try to push more data for the delayed flow
> we may fastpath the skb, not realizing that there are already "ready
> to send" packets for this flow sitting in the qdisc.
>
> Don't trust the fastpath if we are "behind" according to the projected
> Tx time for next flow waiting in the Qdisc. Because we consider anything
> within the offload window to be okay for fastpath we must consider
> the entire offload window as "now".
>
> Qdisc config:
>
> qdisc fq 8001: dev eth0 parent 1234:1 limit 10000p flow_limit 100p \
>   buckets 32768 orphan_mask 1023 bands 3 \
>   priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 \
>   weights 589824 196608 65536 quantum 3028b initial_quantum 15140b \
>   low_rate_threshold 550Kbit \
>   refill_delay 40ms timer_slack 10us horizon 10s horizon_drop
>
> For iperf this change seems to do fine, the reordering is gone.
> The fastpath still gets used most of the time:
>
>   gc 0 highprio 0 fastpath 142614 throttled 418309 latency 19.1us
>   xx_behind 2731
>
> where "xx_behind" counts how many times we hit the new "return false".
>
> CC: stable@xxxxxxxxxxxxxxx
> Fixes: 076433bd78d7 ("net_sched: sch_fq: add fast path for mostly idle qdisc")
> Signed-off-by: Jakub Kicinski <kuba@xxxxxxxxxx>

Reviewed-by: Eric Dumazet <edumazet@xxxxxxxxxx>

Thanks !
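
For anyone following the thread without the diff at hand, the rule described
in the commit message ("don't trust the fastpath if we are behind", with the
whole offload window treated as "now") can be modeled in user space roughly
as below. This is only a sketch and not the patch itself: the struct, the
function name and the field names are assumptions made for illustration, and
the use of UINT64_MAX to mean "no delayed flow" is likewise an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of the state the check looks at. */
struct fq_model {
	/* Projected Tx time (ns) of the next delayed flow waiting in the
	 * qdisc. Assumed to be UINT64_MAX when no flow is delayed.
	 */
	uint64_t time_next_delayed_flow;
	/* Offload window (ns): anything this close to "now" is already
	 * considered okay for the fastpath, so it counts as "now".
	 */
	uint64_t offload_horizon;
};

/* Return true if the fastpath may still be trusted at time 'now' (ns).
 * If a delayed flow is already due within the window, we are "behind":
 * bypassing the queue could send new data ahead of packets that are
 * ready to go, which is the reordering seen in the iperf run.
 * The sum 'now + offload_horizon' is assumed not to overflow.
 */
static bool fq_model_fastpath_allowed(const struct fq_model *q, uint64_t now)
{
	if (q->time_next_delayed_flow <= now + q->offload_horizon)
		return false; /* the case the "xx_behind" counter tallies */

	return true;
}
```

With offload_horizon set to 0 this reduces to a plain "is the next delayed
flow already due?" test; a non-zero window simply makes the check trip
earlier by that amount.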