On Tue, Jan 25, 2011 at 03:09:34PM -0600, Steve Dobbelstein wrote:
>
> I am working on a KVM network performance issue found in our lab running
> the DayTrader benchmark.  The benchmark throughput takes a significant hit
> when running the application server in a KVM guest versus on bare metal.
> We have dug into the problem and found that DayTrader's use of small
> packets exposes KVM's per-packet handling overhead.  I have been able to
> reproduce the performance hit with a simpler setup using the netperf
> benchmark with the TCP_RR test and the request and response sizes set to
> 256 bytes (an example invocation appears at the end of this message).  I
> run the benchmark between two physical systems, each using a 1 Gbps link.
> In order to get the maximum throughput for the system I have to run 100
> instances of netperf.  When I run the netserver processes in a guest, I
> see a maximum throughput that is 51% of what I get if I run the netserver
> processes directly on the host.  The CPU utilization in the guest is only
> 85% at maximum throughput, whereas it is 100% on bare metal.

You are stressing the scheduler pretty hard with this test :)
Is your real benchmark also using a huge number of threads?
If it's not, you might be seeing a different issue.
IOW, the netperf degradation might not be network-related at all,
but might have to do with the speed of context switches in the guest.
Thoughts?

> The KVM host has 16 CPUs.  The KVM guest is configured with 2 VCPUs.  When
> I run netperf on the host I boot the host with maxcpus=2 on the kernel
> command line.  The host is running the current KVM upstream kernel along
> with the current upstream qemu.  Here is the qemu command used to launch
> the guest:
>
> /build/qemu-kvm/x86_64-softmmu/qemu-system-x86_64 -name glasgow-RH60 -m 32768
> -drive file=/build/guest-data/glasgow-RH60.img,if=virtio,index=0,boot=on
> -drive file=/dev/virt/WAS,if=virtio,index=1
> -net nic,model=virtio,vlan=3,macaddr=00:1A:64:E5:00:63,netdev=nic0
> -netdev tap,id=nic0,vhost=on -smp 2
> -vnc :1 -monitor telnet::4499,server,nowait -serial telnet::8899,server,nowait
> --mem-path /libhugetlbfs -daemonize
>
> We have tried various proposed fixes, each with varying degrees of success.
> One such fix was to add code to the vhost thread such that when it found
> the work queue empty it wouldn't immediately go back to sleep, but rather
> would delay for 50 microseconds and then recheck the queue.  If there was
> work on the queue it would loop back and process it; otherwise it would go
> to sleep.  (A sketch of this change appears at the end of this message.)
> The change got us a 13% improvement in the DayTrader throughput.
>
> Running the same netperf configuration on the same hardware but using a
> different hypervisor gets us significantly better throughput numbers.  The
> guest on that hypervisor runs at 100% CPU utilization.  The various fixes
> we have tried have not gotten us close to the throughput seen on the other
> hypervisor.  I'm looking for ideas/input from the KVM experts on how to
> make KVM perform better when handling small packets.
>
> Thanks,
> Steve
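
For reference, the netperf setup described above can be reproduced with
something along these lines (the target address is a placeholder; start
netserver on the system under test first, and adjust the -l duration to
taste):

    # On the system under test (guest or bare metal):
    netserver

    # On the load driver: 100 concurrent TCP_RR instances,
    # 256-byte requests and responses, 60-second runs.
    for i in $(seq 1 100); do
        netperf -H 192.168.1.10 -t TCP_RR -l 60 -- -r 256,256 &
    done
    wait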
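
And for concreteness, here is a sketch of the "poll before sleeping"
experiment in the vhost thread.  This is illustrative only, not the actual
patch tested: the 50-microsecond figure comes from the description above,
but the helper vhost_work_pending() and the exact placement inside the
vhost_worker() loop (drivers/vhost/vhost.c) are assumptions.

    /* Spin this long on an empty queue before letting the thread sleep. */
    #define VHOST_POLL_US 50

    /* Hypothetical helper: peek at the work list without dequeueing. */
    static bool vhost_work_pending(struct vhost_dev *dev)
    {
            bool pending;

            spin_lock_irq(&dev->work_lock);
            pending = !list_empty(&dev->work_list);
            spin_unlock_irq(&dev->work_lock);
            return pending;
    }

    /*
     * In the vhost_worker() loop, where the empty-queue path would
     * otherwise schedule() immediately:
     */
    if (!work) {
            ktime_t start = ktime_get();

            /* Busy-poll for up to VHOST_POLL_US before yielding the CPU. */
            while (ktime_us_delta(ktime_get(), start) < VHOST_POLL_US) {
                    if (vhost_work_pending(dev))
                            break;  /* new work arrived: loop back and run it */
                    cpu_relax();
            }
            if (!vhost_work_pending(dev))
                    schedule();     /* still idle: sleep until woken */
    }

The trade-off is the usual one for polling: the spin burns guest-visible
CPU on an idle queue in exchange for skipping a sleep/wakeup cycle when
packets arrive back-to-back, which is exactly the small-packet case.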