> Subject: Re: Network throughput limits for local VM <-> VM communication
>
> On 06/17/2009 10:36 AM, Fischer, Anna wrote:
>>
>> /usr/bin/qemu-system-x86_64 -m 1024 -smp 2 -name FC10-2 -uuid
>> b811b278-fae2-a3cc-d51d-8f5b078b2477 -boot c -drive
>> file=,if=ide,media=cdrom,index=2 -drive
>> file=/var/lib/libvirt/images/FC10-2.img,if=virtio,index=0,boot=on
>> -net nic,macaddr=54:52:00:11:ae:79,model=e1000 -net tap
>> -net nic,macaddr=54:52:00:11:ae:78,model=e1000 -net tap
>> -serial pty -parallel none -usb -vnc 127.0.0.1:2 -k en-gb
>> -soundhw es1370
>
> Okay, like I suspected, qemu has a trap here and you walked into it.
> The -net option plugs the device you specify into a virtual hub. The
> command line you provided plugs the two virtual NICs and the two tap
> devices into one virtual hub, so any packet received from any of the
> four clients will be propagated to the other three.
>
> To get this to work right, specify the vlan= parameter, which says
> which virtual hub a component is plugged into. Note this has nothing
> to do with 802.blah vlans.
>
> So your command line should look like
>
>   qemu ... -net nic,...,vlan=0 -net tap,...,vlan=0
>            -net nic,...,vlan=1 -net tap,...,vlan=1
>
> This will give you two virtual hubs, each bridging a virtual NIC to a
> tap device.
>
>> This is my "routing VM" that has two network interfaces and routes
>> packets between two subnets. It has one interface plugged into
>> bridge virbr0 and the other interface plugged into virbr1:
>>
>> brctl show
>> bridge name  bridge id          STP enabled  interfaces
>> virbr0       8000.8ac1d18c63ec  no           vnet0
>>                                              vnet1
>> virbr1       8000.2ebfcbb9ed70  no           vnet2
>>                                              vnet3
>
> Please redo the tests with qemu vlans but without 802.blah vlans, so
> we see what happens without packet duplication.

Avi, thanks for your quick reply. I do use the vlan= parameter now,
and I no longer see packet duplication, so everything you said is
right, and I now understand why I was seeing packets on both bridges
before. So this has nothing to do with tun/tap, just with the way QEMU
"virtual hubs" work - I did not know the details of that before. Even
with vlan= set, however, I still see the same issues with weird CPU
utilization and low throughput that I describe below.
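For reference, the network part of my routing VM's command line now
looks roughly like this (MAC addresses as before; the remaining
options are unchanged and elided here):

  # drive/display/other options exactly as in my original invocation
  /usr/bin/qemu-system-x86_64 ... \
    -net nic,vlan=0,macaddr=54:52:00:11:ae:79,model=e1000 -net tap,vlan=0 \
    -net nic,vlan=1,macaddr=54:52:00:11:ae:78,model=e1000 -net tap,vlan=1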
>> If I use the e1000 virtual NIC model, I see performance drop
>> significantly compared to using virtio_net. However, with virtio_net
>> I have the network stalling after a few seconds of high-throughput
>> traffic (as I mentioned in my previous post). Just to reiterate my
>> scenario: I run three guests on the same physical machine, one guest
>> is my routing VM that is routing IP network traffic between the
>> other two guests.
>>
>> I am also wondering about the fact that I do not seem to get CPU
>> utilization maxed out in this case while throughput does not go any
>> higher. I do not understand what is stopping KVM from using more CPU
>> for guest I/O processing? There is nothing else running on my
>> machine. I have analyzed the amount of CPU that each KVM thread is
>> using, and I can see that the thread running the VCPU of the routing
>> VM, which is processing interrupts of the e1000 virtual network
>> card, is using the highest amount of CPU. Is there any way that I
>> can optimize my network set-up? Maybe some specific configuration of
>> the e1000 driver within the guest? Are there any known issues with
>> this?
>
> There are known issues with lack of flow control while sending
> packets out of a guest. If the guest runs tcp that tends to correct
> for it, but if you run a lower level protocol that doesn't have its
> own flow control, the guest may spend a lot of cpu generating packets
> that are eventually dropped. We are working on fixing this.

For the tests I run now (with vlan= set) I am using both TCP and UDP,
and I see the problem with virtio_net for both protocols. What I am
wondering about, though, is that I have no problems when the two
guests communicate directly (i.e. when I plug them into the same
bridge and put them on the same network). Why do I only see network
communication stall when there is a routing VM in the path? Is this
just because the system is even more overloaded in that case? Or could
it be related to the dual-NIC configuration, or to the fact that I run
multiple bridges on the same physical machine? Also, when you say "We
are working on fixing this" - which parts of the code are you working
on? Is this in the QEMU network I/O processing code, or is it
virtio_net related?

>> I also see very different CPU utilization and network throughput
>> figures when pinning threads to CPU cores using taskset. At one
>> point I managed to double the throughput, but I could not reproduce
>> that set-up for some reason. What are the major issues that I would
>> need to pay attention to when pinning threads to cores, in order to
>> optimize my specific set-up so that I can achieve better network I/O
>> performance?
>
> It's black magic, unfortunately. But please retry with the fixed
> configuration and we'll continue from there.

Retry with "the fixed configuration"? You mean setting the vlan=
parameter? I have already used vlan= for the latest tests, so the CPU
utilization issues I am describing do happen with that configuration.

Thanks,
Anna
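P.S. In case it helps to reproduce the pinning experiments: what I did
was along the following lines (the thread ID and core number below are
only illustrative; they were different on my machine):

  # list the qemu threads (TIDs) of the routing VM
  ps -eLf | grep qemu-system-x86_64
  # pin one thread, e.g. TID 4242, to core 2
  taskset -p -c 2 4242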