On Wed, Apr 18, 2012 at 09:44:47PM -0700, Chegu Vinod wrote:
> On 4/17/2012 6:25 AM, Chegu Vinod wrote:
> > On 4/17/2012 2:49 AM, Gleb Natapov wrote:
> >> On Mon, Apr 16, 2012 at 07:44:39AM -0700, Chegu Vinod wrote:
> >>> On 4/16/2012 5:18 AM, Gleb Natapov wrote:
> >>>> On Thu, Apr 12, 2012 at 02:21:06PM -0400, Rik van Riel wrote:
> >>>>> On 04/11/2012 01:21 PM, Chegu Vinod wrote:
> >>>>>> Hello,
> >>>>>>
> >>>>>> While running AIM7 (workfile.high_systime) in a single 40-way (or a
> >>>>>> single 60-way) KVM guest I noticed pretty bad performance when the
> >>>>>> guest was booted with the 3.3.1 kernel, compared to the same guest
> >>>>>> booted with the 2.6.32-220 (RHEL6.2) kernel.
> >>>>>> For the 40-way guest, Guest-RunA (2.6.32-220 kernel) performed
> >>>>>> nearly 9x better than Guest-RunB (3.3.1 kernel). In the case of the
> >>>>>> 60-way guest run, the older guest kernel was nearly 12x better!
> >>>>
> >>>> How many CPUs does your host have?
> >>>
> >>> 80 cores on the DL980 (i.e. 8 Westmere sockets).
> >>>
> >> So you are not oversubscribing CPUs at all. Are those real cores,
> >> or does that include HT?
> >
> > HT is off.
> >
> >> Do you have other CPU hogs running on the host while testing the guest?
> >
> > Nope. Sometimes I do run utilities like "perf", "sar" or "mpstat" on
> > NUMA node 0 (where the guest is not running).
> >
> >>> I was using numactl to bind the qemu of the 40-way guest to NUMA
> >>> nodes 4-7 (or, for a 60-way guest, binding it to nodes 2-7):
> >>>
> >>> /etc/qemu-ifup tap0
> >>>
> >>> numactl --cpunodebind=4,5,6,7 --membind=4,5,6,7 \
> >>>   /usr/local/bin/qemu-system-x86_64 -enable-kvm \
> >>>   -cpu Westmere,+rdtscp,+pdpe1gb,+dca,+xtpr,+tm2,+est,+vmx,+ds_cpl,+monitor,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme \
> >>>   -m 65536 -smp 40 \
> >>>   -name vm1 \
> >>>   -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/vm1.monitor,server,nowait \
> >>>   -drive file=/var/lib/libvirt/images/vmVinod1/vm1.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none \
> >>>   -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
> >>>   -monitor stdio \
> >>>   -net nic,macaddr=<..mac_addr..> \
> >>>   -net tap,ifname=tap0,script=no,downscript=no \
> >>>   -vnc :4
> >>>
> >>> /etc/qemu-ifdown tap0
> >>>
> >>> I knew that there would be a few additional temporary qemu worker
> >>> threads created, i.e. some oversubscription would be there.
> >>>
> >> The 4 nodes above have 40 real cores, yes?
> >
> > Yes.
> > Other than the qemu-related threads and some of the generic per-cpu
> > Linux kernel threads (e.g. migration etc.) there isn't anything else
> > running on these NUMA nodes.
> >
> >> Can you try to run the upstream kernel without binding at all and
> >> check the performance?
>
> Re-ran the same workload *without* binding the qemu, but using the
> 3.3.1 kernel:
>
> 20-way guest: performance got much worse compared to the case where we
>               bind the qemu.
> 40-way guest: about the same as in the case where we bind the qemu.
> 60-way guest: about the same as in the case where we bind the qemu.
>
> Trying out a couple of other experiments...

With 8 sockets the NUMA effects are probably very strong. A couple of
things to try:

1. Run a VM that fits into one NUMA node and bind it to that node.
   Compare the performance of the RHEL kernel and upstream.
2. Run a VM bigger than a NUMA node, bind its vcpus to the NUMA nodes
   separately, and pass the resulting topology to the guest using the
   -numa flag (a rough sketch follows below).

--
        Gleb.
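A rough sketch of what suggestion 2 could look like for the 40-way guest
bound to host nodes 4-7. All node numbers, CPU ranges, memory splits and
thread-id placeholders below are illustrative assumptions, not values from
this thread, and it assumes host CPUs are numbered contiguously per node
(node 4 = CPUs 40-49, ..., node 7 = CPUs 70-79):

  # Expose four guest NUMA nodes, 10 vcpus and a quarter of the memory
  # each (mem= is taken in MB here, like -m):
  numactl --cpunodebind=4,5,6,7 --membind=4,5,6,7 \
    /usr/local/bin/qemu-system-x86_64 -enable-kvm \
    -m 65536 -smp 40 \
    -numa node,mem=16384,cpus=0-9,nodeid=0 \
    -numa node,mem=16384,cpus=10-19,nodeid=1 \
    -numa node,mem=16384,cpus=20-29,nodeid=2 \
    -numa node,mem=16384,cpus=30-39,nodeid=3 \
    ... (rest of the command line as above)

  # "info cpus" in the qemu monitor prints the host thread_id of each
  # vcpu; pin each group of vcpu threads to the matching host node,
  # one taskset call per vcpu thread:
  taskset -pc 40-49 <thread_id of a vcpu in 0-9>
  taskset -pc 50-59 <thread_id of a vcpu in 10-19>
  taskset -pc 60-69 <thread_id of a vcpu in 20-29>
  taskset -pc 70-79 <thread_id of a vcpu in 30-39>

Note that this only pins the vcpus; guest-node memory can still be
allocated from any of the four bound host nodes unless a per-node memory
policy is applied as well. If the guest is managed through libvirt,
"virsh vcpupin" achieves the same pinning.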