OK guys

Using an mlx4 testbed, I can reproduce the problem by pushing coalescing
settings and disabling SG (thus disabling GSO):

    ethtool -K eth0 sg off
    Actual changes:
    scatter-gather: off
            tx-scatter-gather: off
    generic-segmentation-offload: off [requested on]

    ethtool -C eth0 tx-usecs 1024 tx-frames 64

This means the NIC waits about 1 ms (1024 us) before raising the TX IRQ,
and can accumulate up to 64 frames before forcing the interrupt.

We probably have a bug in the cwnd expansion logic:

lpaa23:~# DUMP_TCP_INFO=1 ./netperf -H 10.246.7.152 -Cc
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.152 () port 0 AF_INET
rto=201000 ato=0 pmtu=1500 rcv_ssthresh=29200 rtt=230 rttvar=30 snd_ssthresh=41 cwnd=59
reordering=3 total_retrans=1 ca_state=0 pacing_rate=5943.1 Mbits
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  16384    10.00       530.39   0.40     0.32     2.965   2.398

-> The final cwnd=59 is not enough to avoid the 1 ms delay between bursts.

So the sender sends ~60 packets, then has to wait 1 ms (to get the NIC TX
IRQ) before sending the following burst.

I am CCing Neal, he can probably help to root-cause the problem.

Thanks
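As a rough sanity check on the numbers above, here is a back-of-the-envelope sketch of the throughput ceiling when the sender can only push one cwnd worth of packets per TX completion interval. The function name and the 1448-byte MSS (1500-byte pmtu minus IP/TCP headers and timestamps) are my assumptions, not values taken from the test output.

```python
def burst_limited_throughput_mbps(cwnd_packets, mss_bytes, irq_delay_us, rtt_us=0):
    """Estimate throughput when one cwnd-sized burst is sent per interval.

    Hypothetical helper: models the sender as emitting cwnd_packets
    segments, then stalling for irq_delay_us (plus optionally rtt_us)
    before it can send the next burst.
    """
    bits_per_burst = cwnd_packets * mss_bytes * 8
    interval_us = irq_delay_us + rtt_us
    # bits per microsecond is numerically equal to Mbit/s
    return bits_per_burst / interval_us


# cwnd=59 and tx-usecs 1024 come from the test above; MSS=1448 is assumed.
ceiling = burst_limited_throughput_mbps(59, 1448, 1024)
# Same, but also charging the reported rtt=230 us to every burst.
with_rtt = burst_limited_throughput_mbps(59, 1448, 1024, rtt_us=230)
print(f"{ceiling:.1f} Mbit/s ceiling, {with_rtt:.1f} Mbit/s with rtt added")
```

Under these assumptions the estimate lands in the same range as the 530.39 Mbit/s that netperf reported, which is consistent with the sender being limited to one burst per coalescing interval.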