We have observed TCP bandwidth problems when running benchmarks that gradually increase their payload size with (most of) the data flowing in one direction. The degradation is observed between two dual XEON/E7500 machines, connected by GbE over a cross-over cable. The GbE interface is eth1 and has an MTU of 9000; eth0 is an FE interface with an MTU of 1500. The machines are running Linux 2.4.20, and the NIC in question is an Intel Corp. 82544GC Gigabit Ethernet Controller (rev 02).
While changing TCP parameter settings and systematically varying the socket rx/tx buffer sizes, I discovered that once in a while the benchmark ran well. The good runs were not deterministic and had nothing to do with the parameter changes. Hence, I let the machine run over the weekend and took tcpdumps, keeping the one from a successful run. Comparing the "OK" run with the "BAD" run reveals a couple of interesting things. The first problem is that, in the "BAD" run, the window advertised by the receiver does not increase. The second problem, which also affects the "OK" run, is that the ratio of the number of packets received (#advertisements) to the number of packets sent (from src to dst) is 1 for the "BAD" scenario (this is probably a consequence of the window being only ~2xMTU in size). Surprisingly, it is as large as 0.5 for the "OK" scenario. IMHO, this violates RFC813, which states:
There are two reasons for prompt acknowledgement. One is to
prevent retransmission. We will discuss later how to determine whether
unnecessary retransmission is occurring. The other reason one
acknowledges promptly is to permit further data to be sent. However,
the previous section makes quite clear that it is not always desirable
to send a little bit of data, even though the receiver may have room for
it. Therefore, one can state a general rule that under normal
operation, the receiver of data need not, and for efficiency reasons
should not, acknowledge the data unless either the acknowledgement is
intended to produce an increased useable window, is necessary in order
to prevent retransmission or is being sent as part of a reverse
direction segment being sent for some other reason. We will consider an
algorithm to achieve these goals.
The two tcpdumps have been analyzed by a simple awk script, which reports the following for every 1/10 of a second of the runtime:
time: elapsed time relative to the first packet
MB/s: sum of TCP payload (based on TCP sequence numbers) *1e-6 / delta time
avg_len: average packet length (TCP payload) sent
nsent: no of packets sent from src to dst
avg_win: average window size advertised by the src
adv/sent: ratio between #advertisements (sum of prompt acks and piggybacked acks) and #packets sent
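For readers who want to reproduce the numbers, here is a minimal sketch of the per-interval computation described above, written in Python rather than awk (the original script is not enclosed). It assumes the tcpdump output has already been parsed into tuples; the tuple layout, the "fwd"/"rev" labels and the function name `bin_stats` are illustrative assumptions. It also assumes the window is read from the packets travelling in the reverse (dst to src) direction and that the payload length is already known per packet.

```python
from collections import defaultdict

def bin_stats(packets, bin_width=0.1):
    """Summarize a packet list per time bin.

    packets: list of (time_s, direction, payload_len, window) tuples, where
    direction is "fwd" (src -> dst) or "rev" (dst -> src). This layout is an
    assumption for illustration, not the format of the original awk input.
    """
    if not packets:
        return []
    t0 = packets[0][0]  # elapsed time is relative to the first packet
    bins = defaultdict(lambda: {"bytes": 0, "nsent": 0, "win_sum": 0, "nadv": 0})
    for t, direction, payload, window in packets:
        # The small epsilon guards against float rounding at bin edges.
        idx = int((t - t0) / bin_width + 1e-9)
        b = bins[idx]
        if direction == "fwd":
            b["bytes"] += payload
            b["nsent"] += 1
        else:
            b["win_sum"] += window
            b["nadv"] += 1
    rows = []
    for idx in sorted(bins):
        b = bins[idx]
        nsent, nadv = b["nsent"], b["nadv"]
        rows.append({
            "time": round(idx * bin_width, 1),
            "MB/s": b["bytes"] * 1e-6 / bin_width,
            "avg_len": b["bytes"] / nsent if nsent else 0.0,
            "nsent": nsent,
            "avg_win": b["win_sum"] / nadv if nadv else 0.0,
            "adv/sent": nadv / nsent if nsent else 0.0,
        })
    return rows

# Tiny made-up trace, only to show the call; these are not measured values.
example = [
    (0.00, "fwd", 1000, 0),
    (0.02, "fwd", 1000, 0),
    (0.03, "rev", 0, 17688),
    (0.12, "fwd", 500, 0),
    (0.15, "rev", 0, 17688),
    (0.16, "rev", 0, 20000),
]
for row in bin_stats(example):
    print(row)
```

Summing payload lengths stands in for the sequence-number based byte count mentioned above; the two agree as long as the trace contains no retransmissions.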
Extracts of the analysis are enclosed as "ok.txt" and "bad.txt". The information is also presented as graphs in "bw_winsiz.png" and "bw_adv-sent.png". In the graphs, the "OK" data uses the bottom x-axis and the "BAD" data uses the upper x-axis (the "BAD" run-time is longer due to the lower bandwidth). Bandwidth uses the left y-axis, whereas the average advertised window size and the ratio of #advertisements to #packets sent use the right y-axis.
I do hope this information is useful to you and that the problem can be fixed. Since I do not read the mailing lists, I would appreciate a note back by email if someone picks up on this.
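As a rough cross-check of how the advertised window relates to the bandwidth, one can use the usual bound that a window-limited sender moves at most one window per round trip (throughput <= window / RTT). The sketch below is only back-of-the-envelope arithmetic on typical steady-state values from the enclosed extracts (17688 bytes at about 52.4 MB/s for "BAD", 50432 bytes at about 123.2 MB/s for "OK"). The function names are hypothetical, and the assumption that the "BAD" run is purely window-limited is mine.

```python
def implied_rtt_us(window_bytes, throughput_mb_s):
    """Round-trip time (microseconds) at which one window per round trip
    gives the stated throughput. Bytes divided by MB/s yields microseconds."""
    return window_bytes / throughput_mb_s

def window_bound_mb_s(window_bytes, rtt_us):
    """Upper bound on throughput (MB/s) for a given window and RTT."""
    return window_bytes / rtt_us

# Steady-state values read from the extracts (assumed representative).
bad_rtt = implied_rtt_us(17688, 52.4)
ok_rtt = implied_rtt_us(50432, 123.2)

# What the "OK" window would allow at the RTT implied by the "BAD" run.
ok_bound = window_bound_mb_s(50432, bad_rtt)

print(f"implied RTT, BAD run: {bad_rtt:.0f} us")
print(f"implied RTT, OK run:  {ok_rtt:.0f} us")
print(f"bound for OK window at BAD RTT: {ok_bound:.0f} MB/s")
```

If the assumption holds, the larger window would permit more than the 125 MB/s that GbE can carry, which fits the "OK" run levelling off near line rate rather than at a window limit.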
Cheers, Håkon
--
Håkon Bugge; VP Product Development; Scali AS;
mailto:hob@scali.no; http://www.scali.com; fax: +47 22 62 89 51;
Voice: +47 22 62 89 50; Cellular (Europe+US): +47 924 84 514;
Visiting Addr: Olaf Helsets vei 6, Bogerud, N-0621 Oslo, Norway;
Mail Addr: Scali AS, Postboks 150, Oppsal, N-0619 Oslo, Norway;
Attachment:
bw_adv-sent.png
Description: PNG image
# time MB/s avg_len nsent avg_win adv/sent
0.0 1.3 145 924 17896.0 0.3
0.1 8.0 144 5551 17775.0 0.3
0.2 12.9 272 4748 17688.0 0.3
0.3 13.3 272 4887 17688.0 0.3
0.4 15.3 272 5616 17688.0 0.1
0.5 15.3 272 5632 17688.0 0.1
0.6 15.2 272 5603 17688.0 0.1
0.7 15.3 272 5632 17688.0 0.1
0.8 15.0 272 5509 17688.0 0.1
0.9 15.1 272 5567 17688.0 0.1
1.0 15.4 272 5675 17688.0 0.1
1.1 15.4 272 5647 17688.0 0.1
1.2 15.4 272 5667 17688.0 0.1
1.3 15.3 272 5634 17688.0 0.1
1.4 15.4 272 5668 17688.0 0.1
1.5 15.2 272 5605 17688.0 0.1
1.6 15.2 272 5593 17688.0 0.1
1.7 15.4 272 5645 17688.0 0.1
1.8 15.4 272 5655 17688.0 0.1
1.9 15.2 272 5576 17688.0 0.1
2.0 18.1 340 5309 17688.0 0.1
2.1 20.3 400 5078 17688.0 0.2
2.2 20.2 399 5056 17688.0 0.2
2.3 20.2 413 4891 17688.0 0.2
2.4 21.5 579 3718 17688.0 0.5
2.5 23.4 656 3560 17688.0 0.5
2.6 23.9 664 3595 17688.0 0.5
2.7 25.1 784 3206 17688.0 0.6
2.8 25.4 784 3237 17688.0 0.7
2.9 27.1 880 3083 17688.0 0.7
3.0 32.7 1043 3137 17688.0 0.6
3.1 35.2 1296 2716 17688.0 0.7
3.2 36.7 959 3828 17688.0 0.5
3.3 42.7 852 5012 17688.0 0.3
3.4 63.7 1116 5707 17688.0 0.1
3.5 72.0 2792 2579 17688.0 0.4
3.6 68.2 2951 2309 17688.0 0.4
3.7 57.4 8800 652 17688.0 1.2
3.8 57.9 7706 751 17688.0 1.0
3.9 58.3 7165 814 17688.0 0.9
4.0 57.9 7149 810 17688.0 0.9
4.1 55.9 7459 750 17688.0 0.9
4.2 56.1 7495 749 17688.0 0.9
4.3 55.5 7736 717 17688.0 0.9
4.4 55.6 8150 682 17688.0 1.0
4.5 55.5 8124 683 17688.0 1.0
4.6 56.0 8544 655 17688.0 1.0
4.7 56.0 8524 657 17688.0 1.0
4.8 54.2 8840 613 17688.0 1.0
4.9 53.9 8923 604 17688.0 1.0
5.0 54.0 8905 606 17688.0 1.0
5.1 54.3 8948 607 17688.0 1.0
5.2 54.1 8897 608 17688.0 1.0
5.3 54.0 8948 604 17688.0 1.0
5.4 53.8 8927 603 17688.0 1.0
5.5 54.2 8948 606 17688.0 1.0
5.6 54.3 8948 607 17688.0 1.0
5.7 52.6 8892 592 17688.0 1.0
5.8 50.0 8948 559 17688.0 1.0
5.9 51.9 8910 582 17688.0 1.0
6.0 51.1 8891 575 17688.0 1.0
6.1 50.9 8934 570 17688.0 1.0
6.2 50.0 8883 563 17688.0 1.0
6.3 50.4 8884 567 17688.0 1.0
6.4 50.1 8912 562 17688.0 1.0
6.5 52.1 8887 586 17688.0 1.0
6.6 50.5 8879 569 17688.0 1.0
6.7 51.4 8909 577 17688.0 1.0
6.8 50.7 8903 569 17688.0 1.0
6.9 52.0 8889 585 17688.0 1.0
7.0 50.1 8897 563 17688.0 1.0
7.1 52.2 8917 585 17688.0 1.0
7.2 52.4 8914 588 17688.0 1.0
7.3 52.3 8937 585 17688.0 1.0
7.4 49.2 8905 552 17688.0 1.0
7.5 52.4 8918 587 17688.0 1.0
7.6 52.5 8911 589 17688.0 1.0
7.7 47.7 8916 535 17688.0 1.0
7.8 52.5 8936 588 17688.0 1.0
7.9 52.4 8934 587 17688.0 1.0
8.0 52.2 8926 585 17688.0 1.0
8.1 52.3 8929 586 17688.0 1.0
8.2 52.7 8941 589 17688.0 1.0
8.3 46.2 8927 517 17688.0 1.0
8.4 52.4 8918 587 17688.0 1.0
8.5 52.4 8933 587 17688.0 1.0
8.6 52.5 8933 588 17688.0 1.0
8.7 52.4 8933 587 17688.0 1.0
8.8 52.4 8918 588 17688.0 1.0
8.9 52.3 8933 586 17688.0 1.0
9.0 43.8 8930 490 17688.0 1.0
9.1 52.3 8940 585 17688.0 1.0
9.2 52.6 8948 588 17688.0 1.0
9.3 52.4 8925 587 17688.0 1.0
9.4 52.5 8940 587 17688.0 1.0
9.5 52.2 8925 585 17688.0 1.0
9.6 52.5 8940 587 17688.0 1.0
9.7 52.5 8925 588 17688.0 1.0
9.8 52.5 8948 587 17688.0 1.0
9.9 52.4 8940 586 17688.0 1.0
10.0 52.7 8940 589 17688.0 1.0
10.1 40.5 8919 454 17688.0 1.0
10.2 52.7 8948 589 17688.0 1.0
10.3 52.2 8932 584 17688.0 1.0
10.4 52.5 8933 588 17688.0 1.0
10.5 52.3 8948 585 17688.0 1.0
10.6 52.5 8933 588 17688.0 1.0
10.7 52.6 8948 588 17688.0 1.0
10.8 50.7 8932 568 17688.0 1.0
10.9 51.8 8932 580 17688.0 1.0
11.0 52.3 8948 585 17688.0 1.0
11.1 52.3 8933 586 17688.0 1.0
11.2 52.5 8933 588 17688.0 1.0
11.3 52.3 8948 585 17688.0 1.0
11.4 52.3 8932 586 17688.0 1.0
11.5 52.3 8948 585 17688.0 1.0
11.6 35.2 8926 394 17688.0 1.0
11.7 52.5 8948 587 17688.0 1.0
11.8 52.4 8932 587 17688.0 1.0
11.9 52.4 8948 586 17688.0 1.0
12.0 52.7 8948 589 17688.0 1.0
12.1 52.2 8933 584 17688.0 1.0
12.2 52.8 8948 590 17688.0 1.0
12.3 52.2 8932 584 17688.0 1.0
12.4 52.5 8948 587 17688.0 1.0
12.5 52.3 8948 585 17688.0 1.0
12.6 52.3 8933 586 17688.0 1.0
12.7 52.3 8948 585 17688.0 1.0
12.8 52.3 8932 586 17688.0 1.0
12.9 52.4 8948 586 17688.0 1.0
13.0 52.3 8933 585 17688.0 1.0
13.1 52.3 8948 585 17688.0 1.0
13.2 52.6 8948 588 17688.0 1.0
13.3 52.4 8932 587 17688.0 1.0
13.4 52.5 8948 587 17688.0 1.0
13.5 52.3 8932 585 17688.0 1.0
13.6 52.5 8948 587 17688.0 1.0
13.7 52.4 8948 586 17688.0 1.0
13.8 29.5 8868 333 17688.0 1.0
13.9 53.2 8948 594 17688.0 1.0
14.0 53.3 8948 596 17688.0 1.0
14.1 53.1 8947 593 17688.0 1.0
14.2 52.6 8948 588 17688.0 1.0
14.3 52.3 8948 584 17688.0 1.0
14.4 52.5 8933 588 17688.0 1.0
14.5 52.5 8948 587 17688.0 1.0
14.6 52.5 8948 587 17688.0 1.0
14.7 52.3 8948 584 17688.0 1.0
14.8 52.5 8947 587 17688.0 1.0
14.9 52.3 8948 584 17688.0 1.0
15.0 52.5 8948 587 17688.0 1.0
15.1 52.3 8933 586 17688.0 1.0
15.2 52.4 8948 586 17688.0 1.0
15.3 52.3 8948 585 17688.0 1.0
15.4 52.4 8947 586 17688.0 1.0
15.5 52.2 8948 583 17688.0 1.0
15.6 52.5 8948 587 17688.0 1.0
15.7 52.1 8933 583 17688.0 1.0
15.8 51.2 8948 572 17688.0 1.0
15.9 51.5 8948 576 17688.0 1.0
16.0 52.3 8947 585 17688.0 1.0
16.1 52.4 8948 586 17688.0 1.0
16.2 52.7 8948 589 17688.0 1.0
16.3 52.3 8948 585 17688.0 1.0
16.4 52.5 8947 587 17688.0 1.0
16.5 52.5 8948 587 17688.0 1.0
16.6 52.7 8948 589 17688.0 1.0
16.7 14.3 8767 163 17688.0 1.0
# tot sent: 249059 , tot_recvd: 118683, tot_others: 2
Attachment:
bw_winsiz.png
Description: PNG image
# time MB/s avg_len nsent avg_win adv/sent
0.0 1.4 144 961 17896.0 0.3
0.1 8.1 144 5620 17896.0 0.1
0.2 14.2 271 5235 17896.0 0.3
0.3 14.0 272 5156 17896.0 0.3
0.4 14.1 272 5178 17896.0 0.2
0.5 14.0 272 5145 17896.0 0.3
0.6 14.1 272 5191 17896.0 0.2
0.7 14.0 272 5131 17896.0 0.3
0.8 13.9 272 5109 17896.0 0.2
0.9 14.0 272 5158 17896.0 0.2
1.0 14.2 272 5204 17896.0 0.2
1.1 15.1 272 5532 20743.2 0.1
1.2 15.3 272 5633 21760.0 0.1
1.3 15.2 272 5575 21760.0 0.1
1.4 15.3 272 5638 21760.0 0.1
1.5 15.2 272 5606 21760.0 0.1
1.6 15.2 272 5599 21760.0 0.1
1.7 15.4 272 5654 21760.0 0.1
1.8 15.3 272 5618 21760.0 0.1
1.9 15.3 276 5537 21760.0 0.1
2.0 20.0 400 4988 21760.0 0.2
2.1 19.8 399 4956 21760.0 0.2
2.2 20.0 429 4669 22383.7 0.3
2.3 20.9 528 3957 23232.0 0.5
2.4 21.0 528 3978 23232.0 0.5
2.5 23.8 646 3673 23232.0 0.6
2.6 24.0 656 3664 23232.0 0.5
2.7 22.7 741 3065 23232.0 0.6
2.8 25.0 784 3191 23232.0 0.7
2.9 25.3 803 3144 23232.0 0.6
3.0 32.4 1040 3117 23232.0 0.6
3.1 33.9 1165 2912 23232.0 0.7
3.2 35.1 1296 2707 23232.0 0.7
3.3 39.6 876 4516 25637.6 0.3
3.4 52.0 1042 4985 26384.0 0.3
3.5 56.3 1135 4960 26384.0 0.3
3.6 78.6 1648 4768 26384.0 0.3
3.7 83.6 1768 4731 33811.2 0.4
3.8 99.1 2165 4576 49351.4 0.4
3.9 121.9 5432 2244 50432.0 0.5
4.0 123.7 8617 1435 50432.0 0.6
4.1 121.4 8499 1428 50432.0 0.6
4.2 117.5 8267 1421 50432.0 0.5
4.3 116.3 8155 1426 50432.0 0.5
4.4 123.4 8909 1385 50432.0 0.5
4.5 122.5 8895 1377 50432.0 0.5
4.6 122.5 8895 1377 50432.0 0.5
4.7 123.8 8948 1383 50432.0 0.5
4.8 122.2 8941 1367 50432.0 0.5
4.9 123.8 8948 1383 50432.0 0.5
5.0 123.7 8948 1382 50432.0 0.5
5.1 121.7 8925 1363 50432.0 0.5
5.2 123.8 8948 1383 50432.0 0.5
5.3 121.0 8929 1355 50432.0 0.5
5.4 123.8 8948 1383 50432.0 0.5
5.5 120.4 8925 1349 50432.0 0.5
5.6 118.2 8909 1327 50432.0 0.5
5.7 113.5 8913 1273 50432.0 0.5
5.8 114.7 8927 1285 50432.0 0.5
5.9 112.9 8886 1271 50432.0 0.5
6.0 113.3 8888 1275 50432.0 0.5
6.1 115.5 8938 1292 50432.0 0.5
6.2 118.4 8886 1332 50432.0 0.5
6.3 114.2 8901 1283 50432.0 0.5
6.4 72.2 8899 811 50432.0 0.5
6.5 113.5 8938 1270 50432.0 0.5
6.6 119.7 8909 1344 50432.0 0.5
6.7 121.5 8926 1361 50432.0 0.5
6.8 109.8 8930 1229 50432.0 0.5
6.9 82.4 8903 926 50432.0 0.5
7.0 121.8 8940 1362 50432.0 0.5
7.1 107.6 8928 1205 50432.0 0.5
7.2 122.2 8929 1369 50432.0 0.5
7.3 122.0 8929 1366 50432.0 0.5
7.4 102.1 8936 1142 50432.0 0.5
7.5 122.1 8935 1366 50432.0 0.5
7.6 122.0 8935 1365 50432.0 0.5
7.7 116.7 8938 1306 50432.0 0.5
7.8 122.6 8941 1371 50432.0 0.5
7.9 92.8 8927 1040 50432.0 0.5
8.0 122.2 8941 1367 50432.0 0.5
8.1 122.7 8941 1372 50432.0 0.5
8.2 120.0 8935 1343 50432.0 0.5
8.3 122.5 8941 1370 50432.0 0.5
8.4 122.9 8935 1375 50432.0 0.5
8.5 82.4 8929 923 50432.0 0.5
8.6 123.2 8941 1378 50432.0 0.5
8.7 122.1 8941 1366 50432.0 0.5
8.8 122.9 8941 1374 50432.0 0.5
8.9 122.0 8941 1364 50432.0 0.5
9.0 123.3 8941 1379 50432.0 0.5
9.1 122.6 8941 1371 50432.0 0.5
9.2 123.1 8941 1377 50432.0 0.5
9.3 123.2 8941 1378 50432.0 0.5
9.4 121.8 8946 1361 50432.0 0.5
9.5 70.3 8917 788 50432.0 0.5
9.6 122.9 8947 1374 50432.0 0.5
9.7 123.2 8948 1377 50432.0 0.5
9.8 122.5 8941 1370 50432.0 0.5
9.9 122.9 8947 1373 50432.0 0.5
10.0 122.5 8941 1370 50432.0 0.5
10.1 123.5 8948 1380 50432.0 0.5
10.2 123.2 8947 1377 50432.0 0.5
10.3 122.5 8941 1370 50432.0 0.5
10.4 123.3 8947 1378 50432.0 0.5
10.5 123.3 8948 1378 50432.0 0.5
10.6 123.2 8947 1377 50432.0 0.5
10.7 86.6 8945 968 50432.0 0.5
10.8 0.0 0 3 50432.0 0.3
# tot sent: 271630 , tot_recvd: 100643, tot_others: 0