On 10/20/06, John Heffner <jheffner@xxxxxxx> wrote:
> H.K. Jerry Chu wrote:
> > So I increased tcp_r/wmem[2] to 30MB from 20MB and throughput
> > does improve to ~800Mbps.
> >
> > Then I discovered this is a sender-side-only problem as well, just
> > like the 100ms case. If I specify -w 20m to iperf on the xmit side
> > ONLY, throughput hit ~945Mbps.
>
> If the app requests 20m, the kernel will try to give 40m. What size
> does iperf actually say you got?
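(For context on John's "requests 20m, gives 40m" remark: on Linux, a SO_SNDBUF/SO_RCVBUF request via setsockopt() is roughly doubled to leave headroom for kernel sk_buff bookkeeping, capped by net.core.wmem_max. A minimal sketch of how to observe this; the exact effective value depends on the sysctl caps on the machine:)

```python
import socket

# On Linux, setsockopt(SO_SNDBUF, n) stores approximately 2*n
# (bounded by net.core.wmem_max); getsockopt() returns the doubled,
# effective value.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 16 * 1024)
effective = s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
print(effective)   # typically 32768 on Linux
s.close()
```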
40MB.
> If you increase tcp_wmem to 40m, does it change the behavior?
Yes, now it hits the line rate (942Mbps) after tcp_wmem[2] was
increased to 40MB. I'm surprised it needs to be so big (the B*D
product is only < 19MB).

To recap:
- delay=100ms, recv tcp_rmem[2]=20MB: xmit-side tcp_wmem[2] needs to
  be larger than 24MB to attain the line rate.
- delay=150ms, recv tcp_rmem[2]=30MB: xmit-side tcp_wmem[2] needs to
  be larger than 35MB to attain the line rate.

Is there a simple formula for what the appropriate values of
tcp_r/wmem[2] should be, given a B*D product?

Jerry
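(The thread never settles on a formula, but the arithmetic behind Jerry's "< 19MB" figure is just the bandwidth-delay product. A sketch of the calculation, with a rule of thumb that is only an assumption extrapolated from the numbers reported above, not an official kernel recommendation:)

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> int:
    """Bandwidth-delay product in bytes: bits in flight / 8."""
    return int(bandwidth_bps * rtt_s / 8)

# 1 Gbit/s at 150 ms RTT -> 18.75 MB, matching the "< 19MB" figure.
bdp = bdp_bytes(1e9, 0.150)
print(bdp)  # 18750000

# Rough sizing consistent with the observations in this thread
# (assumption, not a documented rule): tcp_rmem[2] >= ~1x BDP, while
# tcp_wmem[2] needed close to ~2x BDP, plausibly because the send
# buffer also accounts for sk_buff overhead and unacked data held
# for retransmission.
```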
> >>> Thanks for the prompt reply. There is no pkt loss, none, at either
> >>> netem, pfifo_fast, or TCP level (from netstat -s -t). This is a
> >>> back-to-back 1GbE link with one iperf test running.
> >>
> >> What's the output of tcptrace -rlW?
> >
> > Note that the window scale = 9, so you'll have to multiply max win
> > adv, e.g., by 512. Also the data is collected on the xmit side below
> > netem, so I'm not sure how useful it is.
>
> You're right, it's not really useful taken below netem.
>
> FWIW, I run at full gigabit all the time with similar latency and I
> don't have a problem. The sender-side autotuning is so simple it's
> hard to imagine what could be wrong unless congestion control is
> doing something strange, or the limit is too low. The fact that
> you're not seeing any losses or txqueue overflows definitely means
> something is wrong.
>
>   -John
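(The factor of 512 mentioned above comes from TCP window scaling: with a window scale of 9, the raw 16-bit window field in the header is left-shifted by 9 bits, i.e. multiplied by 2^9 = 512. A trivial helper to do the conversion; the function name is mine, not from tcptrace:)

```python
def scaled_window(raw_window: int, wscale: int) -> int:
    # RFC 1323/7323 window scaling: effective window = raw << wscale.
    # With wscale = 9 this multiplies the advertised value by 512.
    return raw_window << wscale

print(scaled_window(64000, 9))  # 32768000
```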