H.K. Jerry Chu wrote:
Yes, now it hits the limit (942Mbps) after tcp_wmem[2] was increased to 40MB. I'm surprised it needs to be so big (B*D is only < 19MB). To recap: with delay=100ms and the receive side's tcp_rmem[2]=20MB, the transmit side's tcp_wmem[2] needs to be larger than 24MB to attain line rate; with delay=150ms and tcp_rmem[2]=30MB, tcp_wmem[2] needs to be larger than 35MB.
So it seems. I'd like to note for the list that it's not really autotuning that's at issue; the send buffer genuinely needs to be that large, upwards of 35MB. (Manual tuning was setting it to 40MB.)
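For anyone reproducing this, the manual tuning amounts to something like the lines below. The 40MB and 30MB maxima are the values from this thread; the min/default fields are common defaults I'm assuming, not values anyone posted:

echo "4096 16384 41943040" > /proc/sys/net/ipv4/tcp_wmem   # max = 40MB
echo "4096 87380 31457280" > /proc/sys/net/ipv4/tcp_rmem   # max = 30MB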
Is there a simple formula for what the appropriate values of tcp_wmem[2]/tcp_rmem[2] should be, given the bandwidth-delay product?
There is -- one BDP to keep the pipe full in normal operation, two so there's room to retransmit during recovery. But something slightly odd is going on here. (BTW, rmem is slightly trickier: it's usually about 4/3*BDP, but can be higher depending on your driver and MTU.)
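To make that concrete, here's a quick back-of-the-envelope check. The ~1Gb/s link rate and decimal-MB units are my assumptions; the delays are from Jerry's tests:

#include <stdio.h>

/* Print suggested buffer ceilings for a given bandwidth and RTT. */
static void suggest(double bw_bps, double rtt_s)
{
	double bdp = bw_bps / 8.0 * rtt_s;	/* bytes in flight at line rate */

	printf("rtt=%3.0fms: BDP=%5.2fMB  2*BDP=%4.1fMB (wmem[2])  "
	       "4/3*BDP=%4.1fMB (rmem[2] floor)\n",
	       rtt_s * 1e3, bdp / 1e6, 2.0 * bdp / 1e6,
	       4.0 / 3.0 * bdp / 1e6);
}

int main(void)
{
	suggest(1e9, 0.100);	/* 12.5MB BDP, 25MB wmem: matches ">24MB" above */
	suggest(1e9, 0.150);	/* 18.75MB BDP ("B*D < 19MB"), 37.5MB wmem: matches ">35MB" */
	return 0;
}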
I think what's going on is that since you're using netem, TCP is double-counting the cloned skbs sitting in the txqueue. Since pretty much a full BDP is sitting in your txqueue, you end up needing about 2*BDP of send buffer. I don't think this double-counting is strictly necessary. Can anyone think of a good reason why it's done (skb_set_owner_w() in tcp_transmit_skb())? It's late and my brain may not be working right.. ;)
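To illustrate the accounting I mean, here's a toy userspace mock. It collapses the kernel's separate write-queue and sk_wmem_alloc counters into a single number and uses made-up struct layouts; it's a sketch of my reading of the clone-charging path, not actual kernel code:

#include <stdio.h>

/* Stand-ins for the real kernel structures (made up for this demo). */
struct sock    { long sk_wmem_alloc; };
struct sk_buff { int truesize; struct sock *sk; };

/* Mock of the charge that skb_set_owner_w() makes against the socket. */
static void charge(struct sk_buff *skb, struct sock *sk)
{
	skb->sk = sk;
	sk->sk_wmem_alloc += skb->truesize;
}

int main(void)
{
	struct sock sk = { 0 };
	struct sk_buff orig = { .truesize = 2048 };
	struct sk_buff clone;

	charge(&orig, &sk);	/* original skb held on the write queue */
	clone = orig;		/* tcp_transmit_skb() clones it for transmit */
	charge(&clone, &sk);	/* clone parked in netem's txqueue charged too */

	printf("charged %ld bytes for one 2048-byte segment in flight\n",
	       sk.sk_wmem_alloc);	/* prints 4096: counted twice */
	return 0;
}

With a full BDP parked in netem's queue, that per-segment doubling is exactly the 2*BDP requirement described above.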
-John