On Tue, 2010-09-14 at 18:14 +0200, Thibault VINCENT wrote:
> On 13/09/2010 19:34, Alex Williamson wrote:
> > On Mon, Sep 13, 2010 at 4:32 AM, Thibault VINCENT
> > <thibault.vincent@xxxxxxxxxxxx> wrote:
> >> Hello
> >>
> >> I'm trying to achieve higher than gigabit transfers over a virtio NIC
> >> with no success, and I can't find a recent bug or discussion about
> >> such an issue.
> >>
> >> The simplest test consists of two VMs running on a high-end blade
> >> server with 4 cores and 4GB RAM each, and a virtio NIC dedicated to
> >> the inter-VM communication. On the host, the two vnet interfaces are
> >> enslaved into a bridge. I use a combination of 2.6.35 on the host and
> >> 2.6.32 in the VMs.
> >> Running iperf or netperf on these VMs, with TCP or UDP, results in
> >> ~900Mbit/s transfers. This is what could be expected of a 1G
> >> interface, and indeed the e1000 emulation performs similarly.
> >>
> >> Changing the txqueuelen, MTU, and offload settings on every interface
> >> (bridge/tap/virtio_net) didn't improve the speed, nor did installing
> >> irqbalance or increasing CPU and RAM.
> >>
> >> Is this normal? Is the multiqueue patch intended to address this?
> >> It's quite possible I missed something :)
> >
> > I'm able to achieve quite a bit more than 1Gbps using virtio-net
> > between 2 guests on the same host connected via an internal bridge.
> > With the virtio-net TX bottom half handler I can easily hit 7Gbps TCP
> > and 10+Gbps UDP using netperf (TCP_STREAM/UDP_STREAM tests). Even
> > without the bottom half patches (not yet in qemu-kvm.git), I can get
> > ~5Gbps. Maybe you could describe your setup further: host details,
> > bridge setup, guests, specific tests, etc. Thanks,
>
> Thanks Alex. I don't use the bottom half patches, but anything between
> 3Gbps and 5Gbps would be fine. Here are some more details:
>
> Host
> ----
> Dell M610 ; 2 x Xeon X5650 ; 6 x 8GB
> Debian Squeeze amd64
> qemu-kvm 0.12.5+dfsg-1
> kernel 2.6.35-1 amd64 (Debian Experimental)
>
> Guests
> ------
> Debian Squeeze amd64
> kernel 2.6.35-1 amd64 (Debian Experimental)
>
> To measure the throughput between the guests, I do the following.
>
> On the host:
> * create a bridge
>     # brctl addbr br_test
>     # ifconfig br_test 1.1.1.1 up
> * start two guests
>     # kvm -enable-kvm -m 4096 -smp 4 \
>         -drive file=/dev/vg/deb0,id=0,boot=on,format=raw \
>         -device virtio-blk-pci,drive=0,id=0 \
>         -device virtio-net-pci,vlan=0,id=1,mac=52:54:00:cf:6a:b0 \
>         -net tap,vlan=0,name=hostnet0
>     # kvm -enable-kvm -m 4096 -smp 4 \
>         -drive file=/dev/vg/deb1,id=0,boot=on,format=raw \
>         -device virtio-blk-pci,drive=0,id=0 \
>         -device virtio-net-pci,vlan=0,id=1,mac=52:54:00:cf:6a:b1 \
>         -net tap,vlan=0,name=hostnet0
> * add the guests to the bridge
>     # brctl addif br_test tap0
>     # brctl addif br_test tap1
>
> On the first guest:
> # ifconfig eth0 1.1.1.2 up
> # iperf -s -i 1
>
> On the second guest:
> # ifconfig eth0 1.1.1.3 up
> # iperf -c 1.1.1.2 -i 1
> ------------------------------------------------------------
> Client connecting to 1.1.1.2, TCP port 5001
> TCP window size: 16.0 KByte (default)
> ------------------------------------------------------------
> [  3] local 1.1.1.3 port 43510 connected with 1.1.1.2 port 5001
> [ ID] Interval       Transfer     Bandwidth
> [  3]  0.0- 1.0 sec  80.7 MBytes   677 Mbits/sec
> [  3]  1.0- 2.0 sec   102 MBytes   855 Mbits/sec
> [  3]  2.0- 3.0 sec   101 MBytes   847 Mbits/sec
> [  3]  3.0- 4.0 sec   104 MBytes   873 Mbits/sec
> [  3]  4.0- 5.0 sec   104 MBytes   874 Mbits/sec
> [  3]  5.0- 6.0 sec   105 MBytes   881 Mbits/sec
> [  3]  6.0- 7.0 sec   103 MBytes   862 Mbits/sec
> [  3]  7.0- 8.0 sec   101 MBytes   848 Mbits/sec
> [  3]  8.0- 9.0 sec   105 MBytes   878 Mbits/sec
> [  3]  9.0-10.0 sec   105 MBytes   882 Mbits/sec
> [  3]  0.0-10.0 sec  1011 MBytes   848 Mbits/sec
>
> Against the host (iperf -s running on the host at 1.1.1.1, client still
> on the second guest):
> # iperf -c 1.1.1.1 -i 1
> ------------------------------------------------------------
> Client connecting to 1.1.1.1, TCP port 5001
> TCP window size: 16.0 KByte (default)
> ------------------------------------------------------------
> [  3] local 1.1.1.3 port 60456 connected with 1.1.1.1 port 5001
> [ ID] Interval       Transfer     Bandwidth
> [  3]  0.0- 1.0 sec  97.9 MBytes   821 Mbits/sec
> [  3]  1.0- 2.0 sec   136 MBytes  1.14 Gbits/sec
> [  3]  2.0- 3.0 sec   153 MBytes  1.28 Gbits/sec
> [  3]  3.0- 4.0 sec   160 MBytes  1.34 Gbits/sec
> [  3]  4.0- 5.0 sec   156 MBytes  1.31 Gbits/sec
> [  3]  5.0- 6.0 sec   122 MBytes  1.02 Gbits/sec
> [  3]  6.0- 7.0 sec   121 MBytes  1.02 Gbits/sec
> [  3]  7.0- 8.0 sec   137 MBytes  1.15 Gbits/sec
> [  3]  8.0- 9.0 sec   139 MBytes  1.17 Gbits/sec
> [  3]  9.0-10.0 sec   140 MBytes  1.17 Gbits/sec
> [  3]  0.0-10.0 sec  1.33 GBytes  1.14 Gbits/sec
>
> You can see it's quite slow compared to your figures, both between the
> guests and to the host. And there is no particular load on any of the
> three systems; htop in a guest reports only one of the four cores going
> up to 70% (sys+user+wait) during the test.
>
> The other tests I mentioned are:
> * iperf or netperf over UDP: maybe 10% faster, no more
> * interface settings: very little effect
>     # ifconfig [br_test,tap0,tap1,eth0] txqueuelen 20000
>     # ifconfig eth0 mtu 65534   <-- guest only
>     # ethtool -K eth0 gso on    <-- guest only
> * doubling the RAM or the number of CPUs in the guests: no effect
> * running the two guests on separate hosts linked by a 10G network:
>   again exactly the same throughput, although I can get >7Gbps between
>   the hosts themselves
>
> Then I reproduced the test on a completely different system, a desktop
> with an Intel i5 and 4GB running Debian Squeeze. Unfortunately I get
> the same figures, so the limitation doesn't seem to be hardware-bound.
>
> What's your distro, kernel, and kvm version, Alex?
> Do you think I need to compile qemu with a specific patch or option
> that may be missing in the Squeeze source?

Do you by chance see better performance if you test a freshly booted
guest with less than 5 minutes of uptime? I see a pretty significant
drop in throughput ramp-up at the 5-minute mark that I haven't been able
to explain. On qemu-kvm, this can keep me below 1Gbps for 1s tests, but
I mostly get up to normal throughput by 10s.

It might be worthwhile for you to build your own qemu-kvm and maybe even
incorporate the bottom half TX patches. These should be in the qemu-kvm
tree whenever the next qemu merge happens (very soon). With those, my
peak throughput is ~3x higher.

Also, for best performance, pay attention to cpufreq on the host. You
might want to use the performance governor for testing.

You shouldn't need any special configure options for qemu-kvm.git; I
typically only use the --prefix option.

Thanks,

Alex
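
Alex's TCP_STREAM/UDP_STREAM figures come from netperf; a roughly
equivalent run in this setup might look like the following (a sketch,
assuming the netperf package is installed in both guests and reusing
Thibault's 1.1.1.x addresses):

On the first guest (1.1.1.2), start the netperf server:
# netserver

On the second guest, run 10-second TCP and UDP stream tests:
# netperf -H 1.1.1.2 -t TCP_STREAM -l 10
# netperf -H 1.1.1.2 -t UDP_STREAM -l 10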
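
For the cpufreq suggestion, a minimal sketch using the standard cpufreq
sysfs interface (the exact paths depend on the host kernel configuration
and are not taken from the thread):

* check the current governor on the first host CPU
  # cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
* switch every host CPU to the performance governor for the test run
  # echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor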
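
And a sketch of building qemu-kvm from source as suggested; the git URL
is an assumption about where qemu-kvm.git was hosted at the time, and
/usr/local is just an example prefix:

* fetch, configure, and build
  $ git clone git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git
  $ cd qemu-kvm
  $ ./configure --prefix=/usr/local
  $ make
* install (as root)
  # make install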