Among many route options, we support the initrwnd/RTAX_INITRWND path
attribute:

 $ ip route change local 127.0.0.0/8 dev lo initrwnd 1024

This sets the initial receive window size (in packets). However, it's
not very useful in practice. For smaller buffers (<128KiB) it can be
used to bring the initial receive window down, but it's hard to
imagine when this is useful. The same effect can be achieved with the
TCP_WINDOW_CLAMP / RTAX_WINDOW option.

For larger buffers (>128KiB) the initial receive window is usually
limited by rcv_ssthresh, which starts at 64KiB. The initrwnd option
can't bring the window above it, which limits its usefulness.

This patch changes that. Now, by setting the RTAX_INITRWND path
attribute we bring the initial rcv_ssthresh up in line with the
initrwnd value. This makes it possible to raise the initial advertised
receive window above 64KiB right after the first TCP RTT.

With this change, the administrator can configure a route (or an skops
eBPF program) where the receive window is opened much faster than
usual. This is useful on big BDP connections - large latency, high
throughput - where it takes a long time to fully open the receive
window, due to the usual rcv_ssthresh cap.

However, this feature should be used with caution. It only makes sense
to employ it in limited circumstances:

 * When using high-bandwidth TCP transfers over big-latency links.
 * When the truesize of the flow/NIC is sensible and predictable.
 * When the application is ready to send a lot of data immediately
   after the flow is established.
 * When the sender has configured a larger than usual `initcwnd`.
 * When optimizing for every possible RTT.

This patch is related to previous work by Ivan Babrou:

  https://lore.kernel.org/bpf/CAA93jw5+LjKLcCaNr5wJGPrXhbjvLhts8hqpKPFx7JeWG4g0AA@xxxxxxxxxxxxxx/T/

Please note that due to TCP wscale semantics, the TCP sender will need
to receive the first ACK to be informed of the large receive window.
That is: the large window is advertised only in the first ACK from the
peer. When the TCP client has a large window, it is advertised in the
third packet (ACK) of the handshake. When the TCP server has a large
window, it is advertised only in the first ACK after some data has
been received.

Syncookie support will be provided in a subsequent patchset, since it
requires more changes.

Marek Majkowski (2):
  RTAX_INITRWND should be able to set the rcv_ssthresh above 64KiB
  Tests for RTAX_INITRWND

 include/linux/tcp.h                                |   1 +
 net/ipv4/tcp_minisocks.c                           |   9 +-
 net/ipv4/tcp_output.c                              |   7 +-
 .../selftests/bpf/prog_tests/tcp_initrwnd.c        | 420 ++++++++++++++++++
 .../selftests/bpf/progs/test_tcp_initrwnd.c        |  30 ++
 5 files changed, 463 insertions(+), 4 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/tcp_initrwnd.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_tcp_initrwnd.c

-- 
2.25.1