Re: W/RTT verification, linux tcp buffers behaviour

On Fri, 12 May 2006 10:18:13 +0200
"Constantinos Makassikis" <cmakassikis@xxxxxxxxx> wrote:

> Here's my problem:
> 
> I am trying to verify the formula :
> 
> W/RTT = Max Throughput
> 
> between two end-hosts belonging to the same private network.
> 
> Where:
> 
> RTT stands for Round Trip Time
> W = min(CWND, AW, SNDBUF)
> CWND : the size of the congestion window
> AW : the size of the receiver's advertised window
> SNDBUF : the size of the send buffer
> 
> In order to do this I have made various bandwidth measurements between the
> two hosts. More precisely, I fixed the receiver's buffer to 16 MBytes
> while varying the sender's buffer between 8 KBytes and 16 MBytes.
> 
> Both hosts, as can be seen below, are capable machines connected to the
> private network through Gigabit Ethernet.
> 
> Hosts' configuration:
> --------------------------
> 
> Debian Linux 2.6.12-1-amd64-k8-smp
> AMD Opteron 246/248
> 2GB RAM
> 80 GB HDD
> Gigabit Ethernet
> TCP-specific options set via sysctl can be found at the end
> of this message.
> 
> 
> As for the network itself, it appears to be of excellent quality: during
> the whole set of experiments no retransmissions are reported, and the RTT
> ranges between 12 and 13 milliseconds.
> 
> Normally one shouldn't expect to approach W/RTT very closely, but given
> the quality of both the network (no losses, very stable RTT) and the end
> hosts, it is surprising to reach at best only 70 % of W/RTT (see below
> for results).
> 
> Bandwidth is measured with the Iperf tool.
> TCP buffer sizes are set with Iperf (via setsockopt()).
> Traffic is dumped with tcpdump on both end hosts.
> Traffic statistics from the tcpdump traces are provided by the tcptrace tool.
> 
> The tcpdump traces made for each transfer confirm the network quality.
> 
> Here are some figures:
> 
> RTT    SNDBUF   RCVBUF   MAX SND   MAX AW     Iperf     W/RTT        %
> ----------------------------------------------------------------------
> 12.7       8     16384      8      6293248     3.57      5.16     69.18
> 12.7      16     16384     10.76   6293248     6.9      10.32     66.85
> 12.7      32     16384     21.4    6293248    13.7      20.64     66.37
> 12.7      64     16384     31.5    6293248    26.7      41.28     64.67
> 12.7     128     16384     49.5    6293248    54.4      82.56     65.88
> 12.7     256     16384    213      6293248   105       165.13     63.58
> 12.7     512     16384    266      6293248   171       330.26     51.77
> 12.7    1024     16384      -      6293248   382       660.52     57.83
> 12.7    2048     16384      -      6293248   673      1321.04     50.94
> 12.7    4096     16384      -      6293248   905      2642.08     34.25
> 
> RTT     : round trip time (milliseconds)
> SNDBUF  : size of the TCP send buffer (KBytes)
> RCVBUF  : size of the TCP receive buffer (KBytes)
> MAX SND : average amount of data sent per RTT (KBytes),
>           estimated from the tcpdump traces
> MAX AW  : maximum size of the advertised window (bytes),
>           provided by the tcpdump traces
> Iperf   : throughput reported by the Iperf tool (Mbits/sec)
> W/RTT   : maximum reachable throughput (Mbits/sec)
> 
> Even though only the maximum size of the advertised window is reported,
> the advertised window actually grows within a few RTTs to exceed SNDBUF,
> so I assumed it was safe to take W = min(CWND, SNDBUF); and since no
> retransmissions are detected, CWND grows beyond the size of SNDBUF, so
> I took W = SNDBUF to compute W/RTT.
> 
> As can be seen, we hardly reach 70 % of the value predicted by the
> formula, and this apparently stems from the fact that MAX SND remains
> relatively low compared to SNDBUF.
> 
> Hereafter lie some questions.
> 
> Questions:
> --------------
> 
> 1) Am I missing or misunderstanding something ?

Linux does autotuning of the send and receive buffer sizes.
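A minimal sketch of the interaction (Python, assuming a Linux host): an explicit SO_SNDBUF setsockopt() disables autotuning for that socket, and the kernel doubles the requested value to leave room for bookkeeping overhead; that doubling also answers question 6 below.

```python
import socket

# An explicit SO_SNDBUF/SO_RCVBUF setsockopt() switches off autotuning
# for that socket.  On Linux the kernel also doubles the requested value
# to leave room for bookkeeping overhead (see man 7 socket), so
# getsockopt() reports twice what was asked for, clamped by
# net.core.wmem_max.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
requested = 64 * 1024  # 64 KBytes, well under the default wmem_max
s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested)

effective = s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
print(effective)  # on Linux: 131072, i.e. twice the requested size
s.close()
```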

> 2) Do you have any other ideas which could explain the low percentage reached ?

The maximum buffer size is limited by tcp_rmem/tcp_wmem; read Documentation/networking/ip-sysctl.txt.
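The limit triples themselves can be read straight out of procfs (a small sketch; each file holds three values meaning min, default, and max, per ip-sysctl.txt):

```python
# Read a Linux TCP autotuning limit triple (min, default, max) from procfs.
def read_triple(path):
    with open(path) as f:
        lo, default, hi = (int(v) for v in f.read().split())
    return lo, default, hi

wmem = read_triple("/proc/sys/net/ipv4/tcp_wmem")  # send-buffer limits
rmem = read_triple("/proc/sys/net/ipv4/tcp_rmem")  # receive-buffer limits
print("tcp_wmem (min, default, max):", wmem)
print("tcp_rmem (min, default, max):", rmem)
```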

> 3) Supposing the low percentage is really due to the sender's buffer
>    not being fully used, why isn't it used to its fullest?
>    Is there some way to overcome this?
> 
> Misc Questions:
> --------------------
> 
> i.e.: questions I tried to answer myself by searching around the
> internet, but for which I found no satisfactory answer, or no
> answer at all.
> 
> 4) Why does the advertised window grow steadily until it reaches
> 6 MBytes, instead of being given a size of 6 MBytes directly at the
> beginning of the connection?

Slow start and autotuning.
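For completeness, the arithmetic behind the W/RTT column above can be checked directly (pure arithmetic on the figures already quoted, taking the 512 KByte row as an example):

```python
# With no losses, W = SNDBUF, so the throughput ceiling W/RTT is
# SNDBUF (in bits) divided by the RTT.
rtt = 12.7e-3        # round trip time, seconds
sndbuf = 512 * 1024  # send buffer, bytes (the 512 KByte row)

w_rtt = sndbuf * 8 / rtt / 1e6  # ceiling in Mbits/sec
print(round(w_rtt, 2))          # 330.26, matching the table

measured = 171.0                         # Iperf figure for the same row
print(round(100 * measured / w_rtt, 1))  # 51.8, in line with the table
```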

> 5) Why does the advertised window remain stuck at 6 MBytes?
tcp_wmem

> 6) Why does the kernel allocate twice the buffer size requested
> by setsockopt()?
> 
> Thank you in advance,
> 
> Constantinos
> 
> 
> ###################
> #           /etc/sysctl.conf            #
> ###################
>
> # increase Linux autotuning TCP buffer limits to 64M
> net.ipv4.tcp_rmem = 4096 87380 67108864
> net.ipv4.tcp_wmem = 4096 65536 67108864
