I've been trying to understand why guest-to-guest performance over a 10GbE
link using virtio, as measured by netperf, drops dramatically when the
socket buffer size is increased on the receiving guest. This is an Intel
X3210 4-core 2.13GHz system running RHEL5.4. I don't see this drop when
going guest-to-host or host-to-guest over the same 10GbE link. Here are
the results from netperf:

Default socket buffer sizes:

Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  16384    60.01      2268.47   47.69    99.95    1.722   3.609

Receiver 256K socket buffer size (actually rmem_max * 2):

Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

262142  16384  16384    60.00      1583.75   39.00    74.09    2.018   3.832

There is increased idle time in the receiver. Using systemtap I found that
the idle time comes from waiting for data (tcp_recvmsg calling
sk_wait_data).

I instrumented qemu on the receiver side to print out some statistics
related to xmit/recv events (a rough sketch of how these counters are
wired in is appended below my signature):

 - "Rx-Could not receive" is incremented whenever do_virtio_net_can_receive
   returns 0
 - "Rx-Ring full" is incremented in do_virtio_net_can_receive whenever
   there are no available entries/space in the receive ring
 - "Rx-Count" is incremented whenever virtio_net_receive2 is called (and
   can receive data)
 - "Rx-Bytes" is increased in virtio_net_receive2 by the number of bytes
   to be read from the tap device
 - "Rx-Ring buffers" is increased by the number of buffers used for the
   data in virtio_net_receive2
 - "Tx-Notify" is incremented whenever virtio_net_handle_tx is invoked
 - "Tx-Sched BH" is incremented whenever virtio_net_handle_tx is invoked
   and the qemu_bh hasn't been scheduled yet
 - "Tx-Packets" is incremented in virtio_net_flush_tx whenever a packet
   is removed from the transmit ring and sent to qemu
 - "Tx-Bytes" is increased in virtio_net_flush_tx by the number of bytes
   sent to qemu

Here are the stats for the two cases:

                               Default            256K
Rx-Could not receive             3,559               0
Rx-Ring full                     3,559               0
Rx-Count                     1,063,056         805,012
Rx-Bytes                18,131,704,980  12,593,270,826
Rx-Ring buffers              4,963,793       3,541,010
Tx-Notify                      125,068         125,702
Tx-Sched BH                    125,068         125,702
Tx-Packets                     147,256         232,219
Tx-Bytes                    11,486,448      18,113,586

Dividing Tx-Bytes by Tx-Packets in each case yields about 78 bytes/packet,
so these are most likely ACKs. But why am I seeing almost 85,000 more of
them in the 256K socket buffer case? Also, dividing Rx-Bytes by Rx-Count
shows that the tap device is delivering about 1,413 fewer bytes per call
to qemu in the 256K socket buffer case. (The arithmetic behind both
figures is spelled out in a small check at the very end of this note.)

Does anyone have some insight as to what is happening?

Thanks,
Tom Lendacky
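For reference, here is roughly how the counters above are hooked into the
virtio-net path. This is only a simplified sketch: the struct and helper
names (vnet_stats, vnet_count_*) are illustrative rather than the actual
patch, but the points at which each counter is bumped match the
descriptions above.

/*
 * Illustrative counter instrumentation for qemu's virtio-net path.
 * Each helper would be called from the function named in its comment.
 */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

static struct {
    uint64_t rx_no_receive;   /* do_virtio_net_can_receive returned 0     */
    uint64_t rx_ring_full;    /* ...because the receive ring had no space */
    uint64_t rx_count;        /* calls into virtio_net_receive2           */
    uint64_t rx_bytes;        /* bytes handed up from the tap device      */
    uint64_t rx_ring_buffers; /* receive-ring buffers consumed            */
    uint64_t tx_notify;       /* virtio_net_handle_tx invocations         */
    uint64_t tx_sched_bh;     /* times the tx bottom half was scheduled   */
    uint64_t tx_packets;      /* packets pulled off the transmit ring     */
    uint64_t tx_bytes;        /* bytes of those packets                   */
} vnet_stats;

/* do_virtio_net_can_receive(): about to return 0. */
static inline void vnet_count_no_receive(int ring_full)
{
    vnet_stats.rx_no_receive++;
    if (ring_full)
        vnet_stats.rx_ring_full++;
}

/* virtio_net_receive2(): data accepted from the tap device. */
static inline void vnet_count_rx(size_t bytes, unsigned int buffers)
{
    vnet_stats.rx_count++;
    vnet_stats.rx_bytes += bytes;
    vnet_stats.rx_ring_buffers += buffers;
}

/* virtio_net_handle_tx(): guest kicked the transmit queue. */
static inline void vnet_count_tx_notify(int bh_newly_scheduled)
{
    vnet_stats.tx_notify++;
    if (bh_newly_scheduled)
        vnet_stats.tx_sched_bh++;
}

/* virtio_net_flush_tx(): one packet taken off the transmit ring. */
static inline void vnet_count_tx_packet(size_t bytes)
{
    vnet_stats.tx_packets++;
    vnet_stats.tx_bytes += bytes;
}

/* Print the current counter values. */
static void vnet_stats_dump(void)
{
    printf("Rx-Could not receive %20" PRIu64 "\n", vnet_stats.rx_no_receive);
    printf("Rx-Ring full         %20" PRIu64 "\n", vnet_stats.rx_ring_full);
    printf("Rx-Count             %20" PRIu64 "\n", vnet_stats.rx_count);
    printf("Rx-Bytes             %20" PRIu64 "\n", vnet_stats.rx_bytes);
    printf("Rx-Ring buffers      %20" PRIu64 "\n", vnet_stats.rx_ring_buffers);
    printf("Tx-Notify            %20" PRIu64 "\n", vnet_stats.tx_notify);
    printf("Tx-Sched BH          %20" PRIu64 "\n", vnet_stats.tx_sched_bh);
    printf("Tx-Packets           %20" PRIu64 "\n", vnet_stats.tx_packets);
    printf("Tx-Bytes             %20" PRIu64 "\n", vnet_stats.tx_bytes);
}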
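And, for completeness, the arithmetic behind the per-packet and per-call
figures, using the counter values from the table above (trivial, but it
makes the ~78 bytes/packet and the ~1,413 bytes/call difference easy to
double-check):

#include <stdio.h>

int main(void)
{
    /* Counter values copied from the table above: [0] = default, [1] = 256K. */
    const double tx_bytes[2]   = { 11486448.0,    18113586.0 };
    const double tx_packets[2] = { 147256.0,      232219.0 };
    const double rx_bytes[2]   = { 18131704980.0, 12593270826.0 };
    const double rx_count[2]   = { 1063056.0,     805012.0 };
    const char *label[2]       = { "Default", "256K" };

    for (int i = 0; i < 2; i++) {
        printf("%-8s  Tx bytes/packet = %5.1f   Rx bytes/call = %8.1f\n",
               label[i],
               tx_bytes[i] / tx_packets[i],
               rx_bytes[i] / rx_count[i]);
    }
    /* Prints ~78 Tx bytes/packet in both cases (ACK-sized), and
     * ~17056 vs ~15644 Rx bytes/call -- about 1,413 bytes less per
     * call in the 256K case. */
    return 0;
}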