Here's my problem: I am trying to verify the formula

    W/RTT = Max Throughput

between two end hosts belonging to the same private network, where:

RTT    : Round Trip Time
W      : min(CWND, AW, SNDBUF)
CWND   : the size of the congestion window
AW     : the size of the receiver's advertized window
SNDBUF : the size of the send buffer

In order to do this I have made various bandwidth measurements between the two hosts. More specifically, I fixed the receiver's receive buffer at 16 MBytes and varied the sender's send buffer between 8 KBytes and 16 MBytes.

Both hosts, as can be seen below, are good machines linked to the private network through Gigabit Ethernet.

Hosts' configuration:
--------------------------
Debian Linux 2.6.12-1-amd64-k8-smp
AMD Opteron 246/248
2GB RAM
80 GB HDD
Gigabit Ethernet

TCP-specific options that are set via sysctl can be found at the end of this letter.

As for the network itself, it appears to be of excellent quality: throughout the experiments no retransmission is reported and the RTT ranges between 12 and 13 milliseconds.

Normally, one shouldn't expect to approach W/RTT very closely, but given the quality of both the network (no losses and a very stable RTT) and the end hosts, it is surprising to get at best only 70 % of W/RTT (see below for results).

Bandwidth is measured with the Iperf tool.
TCP buffer sizes are set with the Iperf tool (via setsockopt()).
Traffic is dumped with tcpdump on both end hosts.
Traffic statistics from the tcpdump traces are provided by the tcptrace tool.

The tcpdump traces made for each transfer confirm the network quality.
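A minimal sketch of how the W/RTT bound can be evaluated. The function names are hypothetical, and the unit convention (1 KByte = 1024 bytes, 1 Mbit = 10^6 bits) is an assumption inferred from the figures reported below, not something stated explicitly in the measurements.

```python
def effective_window_kbytes(cwnd_kbytes: float, aw_kbytes: float,
                            sndbuf_kbytes: float) -> float:
    """W = min(CWND, AW, SNDBUF), all sizes in KBytes."""
    return min(cwnd_kbytes, aw_kbytes, sndbuf_kbytes)


def max_throughput_mbits(w_kbytes: float, rtt_ms: float) -> float:
    """Upper bound W/RTT in Mbits/sec.

    Assumes 1 KByte = 1024 bytes and 1 Mbit = 10**6 bits.
    """
    bits_per_window = w_kbytes * 1024 * 8
    rtt_seconds = rtt_ms / 1000.0
    return bits_per_window / rtt_seconds / 1e6


if __name__ == "__main__":
    # Example: an 8 KByte window over a 12.7 ms round trip.
    print(f"{max_throughput_mbits(8, 12.7):.2f} Mbits/sec")
```

With W = 8 KBytes and RTT = 12.7 ms this gives about 5.16 Mbits/sec.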
Here are some figures:

 RTT  SNDBUF  RCVBUF  MAX SND   MAX AW  Iperf    W/RTT      %
-------------------------------------------------------------
12,7       8   16384        8  6293248   3,57     5,16  69,18
12,7      16   16384    10,76  6293248    6,9    10,32  66,85
12,7      32   16384     21,4  6293248   13,7    20,64  66,37
12,7      64   16384     31,5  6293248   26,7    41,28  64,67
12,7     128   16384     49,5  6293248   54,4    82,56  65,88
12,7     256   16384      213  6293248    105   165,13  63,58
12,7     512   16384      266  6293248    171   330,26  51,77
12,7    1024   16384        -  6293248    382   660,52  57,83
12,7    2048   16384        -  6293248    673  1321,04  50,94
12,7    4096   16384        -  6293248    905  2642,08  34,25

RTT     : round trip time (milliseconds)
SNDBUF  : size of the TCP send buffer (KBytes)
RCVBUF  : size of the TCP receive buffer (KBytes)
MAX SND : the average amount of data sent per RTT (KBytes), estimated from the tcpdump traces
MAX AW  : maximum size of the advertized window (Bytes, i.e. about 6 MBytes), taken from the tcpdump traces
Iperf   : throughput reported by the Iperf tool (Mbits/sec)
W/RTT   : maximum reachable throughput (Mbits/sec)
%       : Iperf throughput as a percentage of W/RTT

Even though only the maximum size of the advertized window is reported, the advertized window actually grows beyond SNDBUF within a few RTTs, so I assumed it safe to take W = min(CWND, SNDBUF). Since no retransmissions are detected, the CWND also grows beyond the size of SNDBUF, and so I took W = SNDBUF to compute W/RTT.

As can be seen, we hardly reach 70 % of the value predicted by the formula, and this appears to be because MAX SND remains relatively low compared to SNDBUF. Some questions follow.

Questions:
--------------
1) Am I missing or misunderstanding something?
2) Do you have any other ideas which could explain the low percentage reached?
3) Supposing the low percentage is really due to the sender's buffer not being fully used, why isn't it used to its fullest? Is there some way to overcome this?
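The percentage column can be re-derived from the measurements, together with the window that the measured throughput would imply (throughput x RTT). This is only a sketch: the helper names are hypothetical, the decimal commas of the table are written as decimal points, and the same unit assumption is used (1 KByte = 1024 bytes, 1 Mbit = 10^6 bits).

```python
RTT_MS = 12.7

# (SNDBUF in KBytes, Iperf throughput in Mbits/sec), copied from the table.
MEASUREMENTS = [
    (8, 3.57), (16, 6.9), (32, 13.7), (64, 26.7), (128, 54.4),
    (256, 105.0), (512, 171.0), (1024, 382.0), (2048, 673.0), (4096, 905.0),
]


def bound_mbits(sndbuf_kbytes: float, rtt_ms: float = RTT_MS) -> float:
    """W/RTT in Mbits/sec, taking W = SNDBUF as in the text."""
    return sndbuf_kbytes * 1024 * 8 / (rtt_ms / 1000.0) / 1e6


def implied_window_kbytes(throughput_mbits: float,
                          rtt_ms: float = RTT_MS) -> float:
    """Window (KBytes) that would sustain the given throughput at this RTT."""
    return throughput_mbits * 1e6 * (rtt_ms / 1000.0) / 8 / 1024


if __name__ == "__main__":
    print("SNDBUF  Iperf    W/RTT      %  implied W (KBytes)")
    for sndbuf, iperf in MEASUREMENTS:
        bound = bound_mbits(sndbuf)
        pct = 100.0 * iperf / bound
        print(f"{sndbuf:6d}  {iperf:5.1f}  {bound:7.2f}  {pct:5.1f}  "
              f"{implied_window_kbytes(iperf):8.1f}")
```

By construction the implied window is the same fraction of SNDBUF as the percentage column, so it can be compared directly with MAX SND for each row.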
Misc Questions:
--------------------
i.e. questions I tried to answer myself by searching around the internet, but for which I didn't find any satisfactory answer or any answer at all.

4) Why does the advertized window grow steadily until it reaches 6 MBytes, instead of being given a size of 6 MBytes directly at the beginning of the connection?
5) Why does the advertized window remain stuck at 6 MBytes?
6) Why does the kernel allocate twice the buffer size requested by setsockopt()?

Thank you in advance,
Constantinos

###################
# /etc/sysctl.conf #
###################
# I mainly disabled ecn, fack, dsack, autotuning
# Left rfc1323 as well as sack enabled
# Left only TCP Reno (i.e.: disabled bictcp, vegas, ...)
net/ipv4/tcp_tso_win_divisor=8
net/ipv4/tcp_moderate_rcvbuf=0
net/ipv4/tcp_bic=0
net/ipv4/tcp_vegas_cong_avoid=0
net/ipv4/tcp_westwood=0
net/ipv4/tcp_no_metrics_save=0
net/ipv4/tcp_low_latency=0
net/ipv4/tcp_frto=0
net/ipv4/tcp_tw_reuse=0
net/ipv4/tcp_adv_win_scale=2
net/ipv4/tcp_app_win=31
net/ipv4/tcp_dsack=0
net/ipv4/tcp_ecn=0
net/ipv4/tcp_reordering=3
net/ipv4/tcp_fack=0
net/ipv4/tcp_orphan_retries=0
net/ipv4/tcp_max_syn_backlog=1024
net/ipv4/tcp_rfc1337=0
net/ipv4/tcp_stdurg=0
net/ipv4/tcp_abort_on_overflow=0
net/ipv4/tcp_tw_recycle=0
net/ipv4/tcp_syncookies=0
net/ipv4/tcp_fin_timeout=60
net/ipv4/tcp_retries2=15
net/ipv4/tcp_retries1=3
net/ipv4/tcp_keepalive_intvl=75
net/ipv4/tcp_keepalive_probes=9
net/ipv4/tcp_keepalive_time=7200
net/ipv4/tcp_max_tw_buckets=180000
net/ipv4/tcp_max_orphans=65536
net/ipv4/tcp_synack_retries=5
net/ipv4/tcp_syn_retries=5
net/ipv4/tcp_retrans_collapse=1
net/ipv4/tcp_sack=1
net/ipv4/tcp_window_scaling=1
net/ipv4/tcp_timestamps=1
net/core/rmem_default=8388608
net/core/rmem_max=8388608
net/core/wmem_default=8388608
net/core/wmem_max=8388608
net/ipv4/ip_no_pmtu_disc=0
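For reference, the behaviour asked about in question 6 can be observed by reading the buffer size back with getsockopt() right after setting it. This is only a sketch: the helper name request_sndbuf is hypothetical, and the value reported back depends on the operating system and on limits such as net/core/wmem_max.

```python
import socket


def request_sndbuf(requested_bytes: int) -> int:
    """Request a send buffer size and return what getsockopt() reports."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested_bytes)
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)


if __name__ == "__main__":
    for requested in (8 * 1024, 64 * 1024):
        granted = request_sndbuf(requested)
        print(f"requested {requested} bytes, kernel reports {granted} bytes")
```

Comparing the two printed numbers for each request shows whether, and by how much, the kernel adjusts the requested size on a given host.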