On 8/26/2013 3:15 PM, Brian Rak wrote:
I've been trying to track down the cause of some serious performance
issues with a Windows 2008R2 KVM guest. So far, I've been unable to
determine what exactly is causing the issue.
When the guest is under load, I see very high kernel CPU usage on the
host (based on `htop` output), along with terrible guest performance.
The workload on the guest is approximately 1/4 of what we'd run
unvirtualized on the same hardware, yet even at that level we max out
every vCPU in the guest.
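(For reference, the user/system/guest CPU split for the qemu process
itself can be checked with pidstat from the sysstat package; the pidof
call below assumes a single qemu process on the host, and the %guest
column needs a reasonably recent sysstat.)

  # Per-second CPU breakdown for the qemu process: %usr, %system, %guest
  pidstat -u -p $(pidof qemu-system-x86_64) 1 10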
Host setup:
Linux nj1058 3.10.8-1.el6.elrepo.x86_64 #1 SMP Tue Aug 20 18:48:29 EDT
2013 x86_64 x86_64 x86_64 GNU/Linux
CentOS 6
qemu 1.6.0
2x Intel E5-2630 (virtualization extensions enabled; 24 logical CPUs
total, including hyperthread siblings)
24GB memory
swap file is enabled, but unused
Guest setup:
Windows Server 2008R2 (64 bit)
24 vCPUs
16 GB memory
VirtIO disk and network drivers installed
QEMU command line:
/qemu16/bin/qemu-system-x86_64 -name VMID100 -S \
  -machine pc-i440fx-1.6,accel=kvm,usb=off \
  -cpu host,hv_relaxed,hv_vapic,hv_spinlocks=0x1000 \
  -m 15259 \
  -smp 24,sockets=1,cores=12,threads=2 \
  -uuid 90301200-8d47-6bb3-0623-bed7c8b1dd7c \
  -no-user-config -nodefaults \
  -chardev socket,id=charmonitor,path=/libvirt111/var/lib/libvirt/qemu/VMID100.monitor,server,nowait \
  -mon chardev=charmonitor,id=monitor,mode=readline \
  -rtc base=utc,driftfix=slew -no-hpet \
  -boot c -usb \
  -drive file=/dev/vmimages/VMID100,if=none,id=drive-virtio-disk0,format=raw,cache=writeback,aio=native \
  -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 \
  -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw \
  -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 \
  -netdev tap,fd=18,id=hostnet0,vhost=on,vhostfd=19 \
  -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:00:2c:6d,bus=pci.0,addr=0x3 \
  -vnc 127.0.0.1:100 -k en-us -vga cirrus \
  -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
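For what it's worth, a breakdown of VM exits by reason from the host
side helps narrow down what the vCPUs are doing when kernel time
spikes. Assuming the host's perf build includes 'perf kvm stat' (and,
again, a single qemu process for the pidof below):

  # Record KVM events for the guest's qemu process for ~10 seconds under load
  perf kvm stat record -p $(pidof qemu-system-x86_64) sleep 10
  # Summarize VM exits by reason (PAUSE, I/O and MSR exits are the interesting ones here)
  perf kvm stat report --event=vmexit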
The beginning of `perf top` output:
Samples: 62M of event 'cycles', Event count (approx.): 642019289177
 64.69%  [kernel]            [k] _raw_spin_lock
  2.59%  qemu-system-x86_64  [.] 0x00000000001e688d
  1.90%  [kernel]            [k] native_write_msr_safe
  0.84%  [kvm]               [k] vcpu_enter_guest
  0.80%  [kernel]            [k] __schedule
  0.77%  [kvm_intel]         [k] vmx_vcpu_run
  0.68%  [kernel]            [k] effective_load
  0.65%  [kernel]            [k] update_cfs_shares
  0.62%  [kernel]            [k] _raw_spin_lock_irq
  0.61%  [kernel]            [k] native_read_msr_safe
  0.56%  [kernel]            [k] enqueue_entity
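Since _raw_spin_lock dominates, the call chains leading into it should
show which lock is actually contended. Assuming the host perf supports
call-graph recording:

  # System-wide sampling with call graphs for ~10 seconds, then inspect
  # the call chains ending in _raw_spin_lock
  perf record -a -g -- sleep 10
  perf report --stdio | less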
I've captured 20,000 lines of kvm trace output, which can be found at:
https://gist.github.com/devicenull/fa8f49d4366060029ee4/raw/fb89720d34b43920be22e3e9a1d88962bf305da8/trace
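(In case anyone wants to reproduce it: this looks like the standard kvm
ftrace events. Assuming trace-cmd is available, something along these
lines produces equivalent output; enabling the raw events under
/sys/kernel/debug/tracing/events/kvm works as well.)

  # Capture all kvm tracepoints for ~30 seconds while the guest is under load
  trace-cmd record -e kvm -o kvm-trace.dat sleep 30
  # Render the binary trace to text
  trace-cmd report -i kvm-trace.dat > trace.txt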
So far, I've tried the following with very little effect:
* Disabling HPET for the guest (-no-hpet)
* Enabling hv_relaxed, hv_vapic, and hv_spinlocks
* Enabling SR-IOV
* Pinning vCPUs to physical CPUs (see the virsh sketch after this list)
* Forcing x2apic on in the guest (bcdedit /set x2apicpolicy yes)
* Toggling the platform clock (bcdedit /set useplatformclock yes and no)
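(For reproducibility: vCPU pinning can be done with libvirt's virsh
along the lines below; the one-to-one vCPU-to-pCPU mapping is just an
example and doesn't account for hyperthread siblings or NUMA.)

  # Example: pin guest vCPU N to host CPU N; adjust for the host topology
  for i in $(seq 0 23); do
      virsh vcpupin VMID100 $i $i
  done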
Any suggestions as to what I can do to get better performance out of
this guest? Or reasons why I'm seeing such high kernel CPU usage with it?
I've done some additional research on this, and I believe that 'kvm_pio:
pio_read at 0xb008 size 4 count 1' is Windows reading the ACPI PM timer.
This timer appears to be backed by the TSC in some cases (I think). I
found this patchset: http://www.spinics.net/lists/kvm/msg91214.html,
which doesn't appear to have been merged yet. Does it seem reasonable
that this patchset would eliminate the need for Windows to read the PM
timer continuously?
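To put a number on it, the captured trace can be checked for how
dominant the PM timer reads are (the 'trace' filename below just refers
to a local copy of the gist linked above):

  # How many of the captured events are PM timer reads at port 0xb008?
  grep -c 'pio_read at 0xb008' trace
  # Breakdown of all pio_read ports seen in the trace
  grep -o 'pio_read at 0x[0-9a-f]*' trace | sort | uniq -c | sort -rn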