Re: Windows Server 2008 VM performance

Andrew Theurer wrote:
I've been looking at how KVM handles windows guests, and I am a little concerned with the CPU overhead. My test case is as follows:

I am running 4 instances of a J2EE benchmark. Each instance needs one application server and one DB server. 8 VMs in total are used.

I have the same App and DB software for Linux and Windows (and same versions) so I can compare between Linux and Windows. I also have another hypervisor which I can test both Windows and Linux VMs.

The host has EPT capable processors. VMs in KVM are backed with large pages.
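
(For anyone trying to reproduce the setup: "backed with large pages" here means hugetlbfs plus qemu's -mem-path option. A rough sketch; the mount point and page count below are only examples:

  echo 4096 > /proc/sys/vm/nr_hugepages      # reserve 2MB huge pages on the host
  mkdir -p /hugepages
  mount -t hugetlbfs none /hugepages
  qemu-system-x86_64 ... -mem-path /hugepages ...
)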


Test results:

Config                                CPU utilization
------                                -----
KVM-85
  Windows Server 2008 64-bit VMs     44.84
  RedHat 5.3 w/ 2.6.29 64-bit VMs    24.56
Other-Hypervisor
  Windows Server 2008 64-bit VMs     30.63
  RedHat 5.3 w/ 2.6.18 64-bit VMs    27.13

-KVM running Windows VMs uses 46% more CPU than the Other-Hypervisor
-The Other-Hypervisor provides an optimized virtual network driver
-KVM results listed above did not use virtio_net or virtio_blk for Windows, but did for Linux (the qemu-side flags are sketched just after this list)
-One extra KVM run (not listed above) was made with virtio_net for the Windows VMs, but it only reduced CPU utilization by 2%

I think the publicly available driver is pretty old and unoptimized.

-Most of the CPU overhead could be attributed to the DB VMs, where there are about 5 MB/sec of writes per VM
-I don't have a virtio_block driver for Windows to test.  Does one exist?

Exists, not public though.

-All tests above used 2 vCPUs per VM
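
(For reference, a rough sketch of the qemu-side flags involved; exact syntax varies between qemu versions, and the tap settings and DISK name below are only illustrative:

  # emulated NIC and IDE disk (no special guest drivers needed):
  -net nic,model=e1000 -net tap,ifname=tap0,script=/etc/qemu-ifup \
  -drive file=/dev/disk/by-id/DISK,if=ide

  # paravirtual NIC and disk (needs virtio_net/virtio_blk drivers inside the guest):
  -net nic,model=virtio -net tap,ifname=tap0,script=/etc/qemu-ifup \
  -drive file=/dev/disk/by-id/DISK,if=virtio
)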


Here's a comparison of kvm_stat between Windows (run1) and Linux (run2):

                run1          run2        run1/run2
                ----          ----        ---------
efer_relo:          0             0         1
exits    :    1206880        121916         9.899

total exits is the prime measure of course.

fpu_reloa:     210969         20863        10.112
halt_exit:      15092         13222         1.141
halt_wake:      14466          9294         1.556
host_stat:     211066         45117         4.678

host state reloads measure (approximately) exits to userspace, likely due to the unoptimized drivers.

hypercall:          0             0         1
insn_emul:     119582         38126         3.136

again lack of drivers

insn_emul:          0             0         1
invlpg   :          0             0         1
io_exits :     131051         26349         4.974

ditto

irq_exits:       8128         12937         0.628
irq_injec:      29955         21825         1.373
irq_windo:       2504          2022         1.238
kvm_reque:          0             0         1
largepage:          1            64         0.009
mmio_exit:      59224             0           Inf

wow, linux avoids mmio completely.  good.



10x the number of exits, a problem?

_the_ problem.
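
(Side note for anyone collecting these numbers: kvm_stat just reads the counters kvm exports through debugfs, so they can also be sampled directly, assuming debugfs is mounted in the usual place:

  mount -t debugfs none /sys/kernel/debug    # if not already mounted
  cat /sys/kernel/debug/kvm/exits            # cumulative since the kvm module was loaded
  cat /sys/kernel/debug/kvm/io_exits /sys/kernel/debug/kvm/mmio_exits
)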


I happened to try just one vCPU per VM for the KVM/Windows VMs, and I was surprised by how much of a difference it made:

Config                                               CPU utilization
------                                               -----
KVM-85
  Windows Server 2008 64-bit VMs, 2 vCPU per VM     44.84
  Windows Server 2008 64-bit VMs, 1 vCPU per VM     36.44

That is a 19% reduction in CPU utilization vs KVM/Windows with 2 vCPUs! It does not explain all of the overhead (vs the Other-Hypervisor with 2 vCPUs per VM), but that sure seems like a lot to lose when going from 1 to 2 vCPUs for the KVM/Windows VMs.
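
For completeness, the vCPU count is just qemu's -smp setting:

  -smp 2    # two vCPUs per VM
  -smp 1    # one vCPU per VM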

Inter-vcpu communication is expensive. Marcelo added some optimizations (the sending vcpu used to wait for the target vcpu; now it doesn't). They're in kvm-86 (of course). You'll need 2.6.26+ on the host for them to take effect (and, as with many features, the more recent the host kernel, the faster kvm on top can run).

Windows 2008 also implements some hypervisor accelerations which are especially useful on smp. Gleb has started some work on this. I don't know if other_hypervisor implements them.
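
(None of this exists in kvm-85 yet; purely for reference, later qemu releases ended up exposing some of these Hyper-V enlightenments as cpu flags, roughly:

  -cpu host,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic

Those flag names come from much newer qemu versions, not from the tree discussed here.)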

Finally, smp is expensive! Data moves across caches, processors wait for each other, and so on.

I have not run with 1 vCPU per VM with Other-Hypervisor, but I will soon. Anyway, I also collected kvm_stat for the 1 vCPU case, and here it is compared to KVM/Linux VMs with 2 vCPUs:

                run1          run2        run1/run2
                ----          ----        ---------
efer_relo:          0             0         1
exits    :    1184471        121916         9.715

Still see the huge difference in vm_exits, so I guess not all is great yet.

Yeah, exit rate stayed the same, so it's probably IPC costs and intrinsic smp costs.

So, what would be the best course of action for this?


Is there a virtio_block driver to test?

There is, but it isn't available yet.

Can we find the root cause of the exits (is there a way to get a stack dump or something else that can show where they are coming from)?

Marcelo is working on a super-duper easy to use kvm trace which can show what's going on. The old one is reasonably easy to use too, though it exports less data. If you can generate some traces, I'll have a look at them.
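
(A rough sketch of what the tracepoint-based flow looks like, assuming a host kernel new enough to export the kvm ftrace events; the older kvm trace mechanism works differently:

  mount -t debugfs none /sys/kernel/debug              # if not already mounted
  echo 1 > /sys/kernel/debug/tracing/events/kvm/enable
  cat /sys/kernel/debug/tracing/trace_pipe > kvm-trace.log
  # run the benchmark for a while, then stop the cat and disable the events:
  echo 0 > /sys/kernel/debug/tracing/events/kvm/enable
)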


P.S. Here is the qemu cmd line for the windows VMs:
/usr/local/bin/qemu-system-x86_64 -name newcastle-xdbt01 -hda /dev/disk/by-id/scsi-3600a0b80000f1eb10000074f4a02b08a

Use -drive file=/dev/very-long-name,cache=none instead of -hda to disable the host page cache. It won't make much of a difference at 5 MB/s, though.
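
For example (same disk, host cache disabled; if=virtio could be added once a Windows virtio_blk driver is available):

  /usr/local/bin/qemu-system-x86_64 -name newcastle-xdbt01 \
      -drive file=/dev/disk/by-id/scsi-3600a0b80000f1eb10000074f4a02b08a,cache=none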



--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

