I currently have an HP DL380 G6 server which I can use for some benchmarks I want to share with you. Maybe someone finds them interesting or useful. Both host and guest run Gentoo with kernel 2.6.31.1 (which is stable 2.6.31, update 1).

The host has the following components:

2 x Intel Xeon L5520 CPUs - two quad-core processors (static performance, VT-d and Hyperthreading enabled in the BIOS)
8 x 300 GB SAS 10k hard drives (RAID 10)
24 GB RAM

(Some) host (kernel) settings:

I/O scheduler: deadline
Filesystem: xfs
CONFIG_HAVE_KVM=y
CONFIG_HAVE_KVM_IRQCHIP=y
CONFIG_KVM=m
CONFIG_KVM_INTEL=m
CONFIG_VIRTIO_BLK=m
CONFIG_VIRTIO_NET=m
CONFIG_VIRTIO_CONSOLE=m
CONFIG_HW_RANDOM_VIRTIO=m
CONFIG_VIRTIO=m
CONFIG_VIRTIO_RING=m
CONFIG_VIRTIO_PCI=m
CONFIG_VIRTIO_BALLOON=m

(Some) guest (kernel) settings:

I/O scheduler: deadline/cfq (see below)
Filesystem: ext3 (data=ordered/writeback, see below)
virtio network and block drivers used
The guest is a qcow2 image (which was expanded with "dd" beforehand to be big enough for the IO tests, so that growing the image doesn't happen during the IO test).
CONFIG_KVM_CLOCK=y
CONFIG_KVM_GUEST=y
CONFIG_HAVE_KVM=y
CONFIG_HAVE_KVM_IRQCHIP=y
CONFIG_VIRTIO_BLK=y
CONFIG_VIRTIO_NET=m
CONFIG_VIRTIO_CONSOLE=y
CONFIG_HW_RANDOM_VIRTIO=y
CONFIG_VIRTIO=y
CONFIG_VIRTIO_RING=y
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_BALLOON=y
CONFIG_PARAVIRT_SPINLOCKS=y

KVM startup options (the important ones):

-m <variable-see_below>
-smp <variable-see_below>
-cpu host
-daemonize
-drive file=/data/kvm/kvmimages/gfs1.qcow2,if=virtio,boot=on
-net nic,vlan=104,model=virtio,macaddr=00:ff:48:23:45:4b
-net tap,vlan=104,ifname=tap.b.gfs1,script=no
-net nic,vlan=96,model=virtio,macaddr=00:ff:48:23:45:4d
-net tap,vlan=96,ifname=tap.f.gfs1,script=no

Since we use KVM mostly for webserver stuff, I ran tests with ApacheBench, GraphicsMagick and IOzone to simulate our workload as well as we can.

ApacheBench with 4 GB RAM, CFQ scheduler, 1/2/4/8 vProcs compared with the host (the hp380g6 graph):
http://www.tauceti.net/kvm-benchmarks/merge-3428/

ApacheBench with 2 GB RAM, CFQ scheduler, 1/2/4/8 vProcs compared with the host (the hp380g6 graph):
http://www.tauceti.net/kvm-benchmarks/merge-6053/

I was interested in whether more RAM gives more throughput and whether more vProcs scale. As you can see, 8 vProcs (compared to 2 x quad-core + HT) give almost the same requests per second in KVM as the host alone can deliver. RAM doesn't matter in this case.

GraphicsMagick resize with 4 GB RAM, CFQ scheduler, 1/2/4/8 vProcs compared with the host (the hp380g6 graph):
http://www.tauceti.net/kvm-benchmarks/merge-5214/

GraphicsMagick resize with 2 GB RAM, CFQ scheduler, 1/2/4/8 vProcs compared with the host (the hp380g6 graph):
http://www.tauceti.net/kvm-benchmarks/merge-7186/

With 8 vProcs the KVM guest runs about 10% slower. Again, RAM doesn't seem to matter here.

The following IO tests were run with the KVM option "cache=none":

IOzone write test with 2 GB RAM, CFQ scheduler, data=ordered, 1/2/4/8 vProcs compared with the host (the hp380g6 graph):
http://www.tauceti.net/kvm-benchmarks/merge-3564/

IOzone write test with 2 GB RAM, deadline scheduler, data=ordered, 1/2/4/8 vProcs compared with the host (the hp380g6 graph):
http://www.tauceti.net/kvm-benchmarks/merge-4533/

It's interesting to see that the deadline scheduler gives you about 10-15 MByte/s more throughput. Please note that I haven't checked the CPU usage during the IO tests.
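For reference, this is roughly how the per-test knobs were set; device name, file sizes and the IOzone parameters here are only illustrative, not the exact values I used:

  # Inside the guest: switch the virtio disk to the deadline elevator
  # (the first virtio disk usually shows up as /dev/vda)
  echo deadline > /sys/block/vda/queue/scheduler

  # Fill and delete a large file once before benchmarking so the qcow2
  # image is fully grown (path and size are examples)
  dd if=/dev/zero of=/var/tmp/fill bs=1M count=20480 && rm /var/tmp/fill && sync

  # A simple IOzone write/rewrite run (test 0); record and file size
  # are examples
  iozone -i 0 -r 64k -s 4g -f /var/tmp/iozone.tmp

  # On the host, "cache=none" is appended to the -drive option:
  #   -drive file=/data/kvm/kvmimages/gfs1.qcow2,if=virtio,boot=on,cache=none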
The following IO test was run with no "cache" option set, so it defaults to "writethrough":

IOzone write test with 2 GB RAM, deadline scheduler, data=ordered and data=writeback (see testpage), 8 vProcs compared with the host (the hp380g6 graph):
http://www.tauceti.net/kvm-benchmarks/merge-7526/

All in all, these tests suggest that a KVM guest can perform almost as well as the host. I haven't tested the network throughput, but it works quite well when I copy big files with ssh and/or rsync - I just have no numbers yet. I've been using KVM for two years now, one year of that in production, and I'm satisfied with it. Thanks to all developers for their work!

But what is still confusing to me is the IO thing. Twice a new release destroyed some data due to a bug, but it was not obvious to me that it was a bug, nor whether it was in KVM (with a new kernel release) or in QEMU. So I've decided to always start my KVMs with "cache=none" and to use ext3 with data=ordered as the guest filesystem. Now the benchmark above shows quite impressive results with the KVM default option "cache=writethrough", but because of what happened to me in the past I'm very reluctant to switch from "cache=none" to "cache=writethrough", out of concern for data integrity. I know there was some discussion about this a while ago, but can some of the developers confirm that the following combinations preserve the data integrity of the guest filesystems and the integrity of the qcow2 image as a whole in case of a crash:

Host: filesystem xfs for the KVM qcow2 images
AND
Guest: cache=writethrough, virtio block driver, ext3 filesystems with data=ordered
OR
Guest: cache=none, virtio block driver, ext3 filesystems with data=ordered

I still don't trust ext3/ext4 with data=writeback, or xfs, as guest filesystems because of their more aggressive write caching - and the worst thing that can happen is that you lose data...

Thanks - Robert
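PS: To make the two variants I'm asking about concrete, here is a sketch; the fstab line is an example, and the device name and mount point are assumptions:

  # Variant 1: qemu-kvm default, host page cache used in writethrough mode
  #   -drive file=/data/kvm/kvmimages/gfs1.qcow2,if=virtio,boot=on
  # Variant 2: host page cache bypassed via O_DIRECT
  #   -drive file=/data/kvm/kvmimages/gfs1.qcow2,if=virtio,boot=on,cache=none

  # Guest /etc/fstab: spell out the ext3 data mode explicitly instead of
  # relying on the kernel's compiled-in default
  /dev/vda1  /  ext3  data=ordered  0 1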