On Wed, Apr 18, 2012 at 09:44:47PM -0700, Chegu Vinod wrote:
> On 4/17/2012 6:25 AM, Chegu Vinod wrote:
> > On 4/17/2012 2:49 AM, Gleb Natapov wrote:
> >> On Mon, Apr 16, 2012 at 07:44:39AM -0700, Chegu Vinod wrote:
> >>> On 4/16/2012 5:18 AM, Gleb Natapov wrote:
> >>>> On Thu, Apr 12, 2012 at 02:21:06PM -0400, Rik van Riel wrote:
> >>>>> On 04/11/2012 01:21 PM, Chegu Vinod wrote:
> >>>>>> Hello,
> >>>>>>
> >>>>>> While running AIM7 (workfile.high_systime) in a single 40-way (or a
> >>>>>> single 60-way) KVM guest I noticed pretty bad performance when the
> >>>>>> guest was booted with the 3.3.1 kernel, compared to the same guest
> >>>>>> booted with the 2.6.32-220 (RHEL6.2) kernel.
> >>>>>> For the 40-way guest, Guest-RunA (2.6.32-220 kernel) performed
> >>>>>> nearly 9x better than Guest-RunB (3.3.1 kernel). In the case of the
> >>>>>> 60-way guest run, the older guest kernel was nearly 12x better!
> >>>>
> >>>> How many CPUs does your host have?
> >>>
> >>> 80 cores on the DL980 (i.e. 8 Westmere sockets).
> >>>
> >> So you are not oversubscribing CPUs at all. Are those real cores,
> >> or does that include HT?
> >
> > HT is off.
> >
> >> Do you have other CPU hogs running on the host while testing the guest?
> >
> > Nope. Sometimes I do run utilities like "perf", "sar" or "mpstat" on
> > NUMA node 0 (where the guest is not running).
> >
> >>> I was using numactl to bind the qemu of the 40-way guest to NUMA
> >>> nodes 4-7 (or, for a 60-way guest, binding it to nodes 2-7):
> >>>
> >>> /etc/qemu-ifup tap0
> >>>
> >>> numactl --cpunodebind=4,5,6,7 --membind=4,5,6,7 \
> >>>   /usr/local/bin/qemu-system-x86_64 -enable-kvm \
> >>>   -cpu Westmere,+rdtscp,+pdpe1gb,+dca,+xtpr,+tm2,+est,+vmx,+ds_cpl,+monitor,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme \
> >>>   -m 65536 -smp 40 \
> >>>   -name vm1 \
> >>>   -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/vm1.monitor,server,nowait \
> >>>   -drive file=/var/lib/libvirt/images/vmVinod1/vm1.img,if=none,id=drive-virtio-disk0,format=qcow2,cache=none \
> >>>   -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
> >>>   -monitor stdio \
> >>>   -net nic,macaddr=<..mac_addr..> \
> >>>   -net tap,ifname=tap0,script=no,downscript=no \
> >>>   -vnc :4
> >>>
> >>> /etc/qemu-ifdown tap0
> >>>
> >>> I knew that there would be a few additional temporary qemu worker
> >>> threads created, i.e. some oversubscription would be there.
> >>>
> >> The 4 nodes above have 40 real cores, yes?
> >
> > Yes.
> > Other than the qemu-related threads and some of the generic per-cpu
> > Linux kernel threads (e.g. migration etc.) there isn't anything else
> > running on these NUMA nodes.
> >
> >> Can you try to run the upstream kernel without binding at all and
> >> check the performance?
>
> Re-ran the same workload *without* binding the qemu, but using the
> 3.3.1 kernel:
>
> 20-way guest: performance got much worse compared to the case where we
>               bind the qemu.
> 40-way guest: about the same as in the case where we bind the qemu.
> 60-way guest: about the same as in the case where we bind the qemu.
>
> Trying out a couple of other experiments...

With 8 sockets the NUMA effects are probably very strong. A couple of
things to try:

1. Run a VM that fits into one NUMA node and bind it to that node.
   Compare the performance of the RHEL kernel and upstream.
2. Run a VM bigger than a NUMA node, bind its vcpus to the NUMA nodes
   separately, and pass the resulting topology to the guest using the
   -numa flag (a rough sketch follows below).

--
        Gleb.
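A rough sketch of what suggestion 2 could look like for the 40-way guest
bound to host nodes 4-7. All node numbers, CPU ranges, memory splits and
thread-id placeholders below are illustrative assumptions, not values from
this thread, and it assumes host CPUs are numbered contiguously per node
(node 4 = CPUs 40-49, ..., node 7 = CPUs 70-79):

  # Expose four guest NUMA nodes, 10 vcpus and a quarter of the memory
  # each (mem= is taken in MB here, like -m):
  numactl --cpunodebind=4,5,6,7 --membind=4,5,6,7 \
    /usr/local/bin/qemu-system-x86_64 -enable-kvm \
    -m 65536 -smp 40 \
    -numa node,mem=16384,cpus=0-9,nodeid=0 \
    -numa node,mem=16384,cpus=10-19,nodeid=1 \
    -numa node,mem=16384,cpus=20-29,nodeid=2 \
    -numa node,mem=16384,cpus=30-39,nodeid=3 \
    ... (rest of the command line as above)

  # "info cpus" in the qemu monitor prints the host thread_id of each
  # vcpu; pin each group of vcpu threads to the matching host node,
  # one taskset call per vcpu thread:
  taskset -pc 40-49 <thread_id of a vcpu in 0-9>
  taskset -pc 50-59 <thread_id of a vcpu in 10-19>
  taskset -pc 60-69 <thread_id of a vcpu in 20-29>
  taskset -pc 70-79 <thread_id of a vcpu in 30-39>

Note that this only pins the vcpus; guest-node memory can still be
allocated from any of the four bound host nodes unless a per-node memory
policy is applied as well. If the guest is managed through libvirt,
"virsh vcpupin" achieves the same pinning.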