Jonathon Ross wrote:
Summary:
Linux's TCP congestion control appears to count outstanding packets rather than outstanding segments (unlike BSD). This is a big problem for us because we send lots of small packets to keep latency low. Also, when congestion control kicks in, the maximum number of outstanding packets Linux allows appears to be 1, which exacerbates my problem.
Why I care:
This isn't an internet application, so don't flame me about lots of small packets w/ TCP_NODELAY on. Customers pay us (and telco costs) for extremely low-latency financial market data and the ability to place and cancel orders in under 10ms. Because of the number of black-box driven trading systems today, markets move very quickly and a 30ms delay because of a TCP ACK is very bad.
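For readers unfamiliar with the option mentioned above, here is a minimal sketch of turning TCP_NODELAY on for a socket. The helper name `make_low_latency_socket` is made up for illustration and is not taken from the application described here. Note that the option only disables Nagle's algorithm on the sending side; it does not change how the peer delays its ACKs.

```python
import socket


def make_low_latency_socket():
    """Create a TCP socket with Nagle's algorithm disabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY: send small writes immediately instead of coalescing them.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


if __name__ == "__main__":
    s = make_low_latency_socket()
    # A non-zero value means the option is enabled.
    print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))
    s.close()
```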
Examples:
from the 2.4.20 kernel--
Server -> sends data
Client -> sends ack
Server -> sends data
Client -> sends ack
Server -> sends data
Client -> sends data/heartbeat/order/etc. and piggybacked ack
  (At this point linux considers the connection bi-directional and kicks off the delayed ack timer)
Server -> sends data
Server -> sends data
  (No Ack Received, congestion window considered full with 2 outstanding packets, server stack will send no data)
Client -> Delayed ack timer expires, or client sends data+ack
Server resumes.
which you can see in this sniff:
001934 10.50.6.158.9090 > 10.50.6.53.34990: P 11639:11662(23) ack 1 win 5792 <nop,nop,timestamp 1027452346 431322340> (DF)
000069 10.50.6.53.34990 > 10.50.6.158.9090: . ack 11662 win 5840 <nop,nop,timestamp 431322340 1027452346> (DF)
001936 10.50.6.158.9090 > 10.50.6.53.34990: P 11662:11685(23) ack 1 win 5792 <nop,nop,timestamp 1027452346 431322340> (DF)
000068 10.50.6.53.34990 > 10.50.6.158.9090: . ack 11685 win 5840 <nop,nop,timestamp 431322340 1027452346> (DF)
001934 10.50.6.158.9090 > 10.50.6.53.34990: P 11685:11708(23) ack 1 win 5792 <nop,nop,timestamp 1027452346 431322340> (DF)
000069 10.50.6.53.34990 > 10.50.6.158.9090: . ack 11708 win 5840 <nop,nop,timestamp 431322340 1027452346> (DF)
001936 10.50.6.158.9090 > 10.50.6.53.34990: P 11708:11731(23) ack 1 win 5792 <nop,nop,timestamp 1027452346 431322340> (DF)
002004 10.50.6.158.9090 > 10.50.6.53.34990: P 11731:11754(23) ack 1 win 5792 <nop,nop,timestamp 1027452347 431322340> (DF)
000670 10.50.6.53.34990 > 10.50.6.158.9090: P 1:2(1) ack 11708 win 5840 <nop,nop,timestamp 431322341 1027452346> (DF)
000098 10.50.6.158.9090 > 10.50.6.53.34990: . ack 2 win 5792 <nop,nop,timestamp 1027452347 431322341> (DF)
037077 10.50.6.53.34990 > 10.50.6.158.9090: . ack 11754 win 5840 <nop,nop,timestamp 431322345 1027452346> (DF)
000169 10.50.6.158.9090 > 10.50.6.53.34990: P 11754:12168(414) ack 2 win 5792 <nop,nop,timestamp 1027452350 431322345> (DF)
000004 10.50.6.158.9090 > 10.50.6.53.34990: P 12168:12191(23) ack 2 win 5792 <nop,nop,timestamp 1027452350 431322345> (DF)
000100 10.50.6.53.34990 > 10.50.6.158.9090: . ack 12168 win 6432 <nop,nop,timestamp 431322345 1027452350> (DF)
000020 10.50.6.53.34990 > 10.50.6.158.9090: . ack 12191 win 6432 <nop,nop,timestamp 431322345 1027452350> (DF)
001821 10.50.6.158.9090 > 10.50.6.53.34990: P 12191:12214(23) ack 2 win 5792 <nop,nop,timestamp 1027452351 431322345> (DF)
000072 10.50.6.53.34990 > 10.50.6.158.9090: . ack 12214 win 6432 <nop,nop,timestamp 431322345 1027452351> (DF)
001931 10.50.6.158.9090 > 10.50.6.53.34990: P 12214:12237(23) ack 2 win 5792 <nop,nop,timestamp 1027452351 431322345> (DF)
000068 10.50.6.53.34990 > 10.50.6.158.9090: . ack 12237 win 6432 <nop,nop,timestamp 431322345 1027452351> (DF)
001935 10.50.6.158.9090 > 10.50.6.53.34990: P 12237:12260(23) ack 2 win 5792 <nop,nop,timestamp 1027452351 431322345> (DF)
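To make the stall in the sniff easier to spot, the following sketch scans a few of the lines above and reports the largest inter-packet gap. It assumes the leading six-digit field is the delta to the previous packet in microseconds; the trace lines are trimmed of their TCP options for brevity, and the function names are illustrative only.

```python
# Excerpt from the sniff above, trimmed of TCP options for brevity.
# Assumption: the first field is the time since the previous packet, in microseconds.
TRACE = """\
002004 10.50.6.158.9090 > 10.50.6.53.34990: P 11731:11754(23) ack 1 win 5792
000670 10.50.6.53.34990 > 10.50.6.158.9090: P 1:2(1) ack 11708 win 5840
000098 10.50.6.158.9090 > 10.50.6.53.34990: . ack 2 win 5792
037077 10.50.6.53.34990 > 10.50.6.158.9090: . ack 11754 win 5840
000169 10.50.6.158.9090 > 10.50.6.53.34990: P 11754:12168(414) ack 2 win 5792
"""


def parse_deltas(trace):
    """Return a list of (delta_us, rest_of_line) for each non-empty line."""
    rows = []
    for line in trace.splitlines():
        if not line.strip():
            continue
        delta, rest = line.split(None, 1)
        rows.append((int(delta), rest))
    return rows


def largest_gap(trace):
    """Return the (delta_us, rest_of_line) pair with the longest preceding gap."""
    return max(parse_deltas(trace), key=lambda row: row[0])


if __name__ == "__main__":
    gap_us, packet = largest_gap(TRACE)
    print(f"{gap_us / 1000:.1f} ms of silence before: {packet}")
```

Under that assumption, the longest gap precedes the client's bare ACK of 11754, after which the server sends the 414 bytes that had queued up in the meantime.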
Also, I think this is the same problem: http://www.icase.edu/coral/LinuxTCP.html
What I think I'd like, in order of desirability:
1) The congestion control to use segments instead of packets. Having 1 50-byte packet un-ACKed and the stream stopping seems a little excessive. Stopping the stream after 1 un-ACKed MTU seems reasonable.
2) A way to increase the number of outstanding packets to a number greater than 1.
3) A way to disable the congestion control entirely.
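As a rough illustration of why option 1 matters for small messages, the arithmetic below compares a window counted in packets with the same window counted in bytes. This is a toy calculation, not a description of what either the Linux or BSD stack actually does. The 23-byte message size and the window of 2 come from the example above; the MSS of 1448 bytes is an assumption (1500-byte MTU minus 52 bytes of IP/TCP headers with timestamps).

```python
MSG_BYTES = 23     # payload size of each small message in the sniff
MSS_BYTES = 1448   # assumed MSS, not stated in the original post
WINDOW = 2         # window size at which the sender stalled in the example


def in_flight_if_counting_packets(window):
    """Each small message uses up one whole window slot."""
    return window


def in_flight_if_counting_bytes(window, mss=MSS_BYTES, msg=MSG_BYTES):
    """The window is window * mss bytes, shared by however many messages fit."""
    return (window * mss) // msg


if __name__ == "__main__":
    print("packet-counted:", in_flight_if_counting_packets(WINDOW))
    print("byte-counted:  ", in_flight_if_counting_bytes(WINDOW))
```

With these numbers, a packet-counted window allows 2 messages in flight, while a byte-counted window of the same nominal size would allow 125.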
If I'm totally mistaken, and I don't understand my problem, or what I think I need, please tell me. Any help would be appreciated.
Thanks, -Jon
: send the line "unsubscribe linux-net" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
-- Casey Carter Casey@Carter.net ccarter@cs.uiuc.edu AIM: cartec69