H.K. Jerry Chu wrote:
Yes, now it hits the limit (942Mbps) after tcp_wmem[2] was increased to 40MB. I'm surprised it needs to be so big (B*D is only < 19MB). To recap: with delay=100ms and the receive side's tcp_rmem[2]=20MB, the transmit side's tcp_wmem[2] needs to be larger than 24MB to attain line rate; with delay=150ms and tcp_rmem[2]=30MB, tcp_wmem[2] needs to be larger than 35MB.
So it seems. I'd like to note for the list that it's not really autotuning that's at issue; the send buffer genuinely needs to be that large, upwards of 35MB. (Manual tuning was setting it to 40MB.)
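For anyone reproducing this, the manual tuning amounts to something like the lines below. The 40MB and 30MB maxima are the values from this thread; the min/default fields are common defaults I'm assuming, not values anyone posted:

echo "4096 16384 41943040" > /proc/sys/net/ipv4/tcp_wmem   # max = 40MB
echo "4096 87380 31457280" > /proc/sys/net/ipv4/tcp_rmem   # max = 30MB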
Is there a simple formula for what the appropriate values of tcp_wmem[2]/tcp_rmem[2] should be, given the bandwidth-delay product?
There is -- one BDP to keep the pipe full in normal operation, two so there's room to retransmit during recovery. But something slightly odd is going on here. (BTW, rmem is slightly trickier: it's usually about 4/3*BDP, but can be higher depending on your driver and MTU.)
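To make that concrete, here's a quick back-of-the-envelope check. The ~1Gb/s link rate and decimal-MB units are my assumptions; the delays are from Jerry's tests:

#include <stdio.h>

/* Print suggested buffer ceilings for a given bandwidth and RTT. */
static void suggest(double bw_bps, double rtt_s)
{
	double bdp = bw_bps / 8.0 * rtt_s;	/* bytes in flight at line rate */

	printf("rtt=%3.0fms: BDP=%5.2fMB  2*BDP=%4.1fMB (wmem[2])  "
	       "4/3*BDP=%4.1fMB (rmem[2] floor)\n",
	       rtt_s * 1e3, bdp / 1e6, 2.0 * bdp / 1e6,
	       4.0 / 3.0 * bdp / 1e6);
}

int main(void)
{
	suggest(1e9, 0.100);	/* 12.5MB BDP, 25MB wmem: matches ">24MB" above */
	suggest(1e9, 0.150);	/* 18.75MB BDP ("B*D < 19MB"), 37.5MB wmem: matches ">35MB" */
	return 0;
}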
I think what's going on is that since you're using netem, TCP is double-counting the cloned skbs sitting in the txqueue. Since pretty much a full BDP is sitting in your txqueue, you end up needing about 2*BDP of send buffer. I don't think this double-counting is strictly necessary. Can anyone think of a good reason why it's done (skb_set_owner_w() in tcp_transmit_skb())? It's late and my brain may not be working right.. ;)
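To illustrate the accounting I mean, here's a toy userspace mock. It collapses the kernel's separate write-queue and sk_wmem_alloc counters into a single number and uses made-up struct layouts; it's a sketch of my reading of the clone-charging path, not actual kernel code:

#include <stdio.h>

/* Stand-ins for the real kernel structures (made up for this demo). */
struct sock    { long sk_wmem_alloc; };
struct sk_buff { int truesize; struct sock *sk; };

/* Mock of the charge that skb_set_owner_w() makes against the socket. */
static void charge(struct sk_buff *skb, struct sock *sk)
{
	skb->sk = sk;
	sk->sk_wmem_alloc += skb->truesize;
}

int main(void)
{
	struct sock sk = { 0 };
	struct sk_buff orig = { .truesize = 2048 };
	struct sk_buff clone;

	charge(&orig, &sk);	/* original skb held on the write queue */
	clone = orig;		/* tcp_transmit_skb() clones it for transmit */
	charge(&clone, &sk);	/* clone parked in netem's txqueue charged too */

	printf("charged %ld bytes for one 2048-byte segment in flight\n",
	       sk.sk_wmem_alloc);	/* prints 4096: counted twice */
	return 0;
}

With a full BDP parked in netem's queue, that per-segment doubling is exactly the 2*BDP requirement described above.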
-John