Avi Kivity wrote:
Andrew Theurer wrote:
I wanted to share some performance data for KVM and Xen. I thought it
would be interesting to compare the two on a more complex scenario,
like heterogeneous server consolidation.
The Workload:
The workload is one that simulates a consolidation of servers onto a
single host. There are 3 server types: web, imap, and app (j2ee). In
addition, there are other "helper" servers which are also consolidated:
a db server, which helps out with the app server, and an nfs server,
which helps out with the web server (a portion of the docroot is nfs
mounted). There is also one other server that is simply idle. All 6
servers make up one set. The first 3 server types are sent requests by
the clients and may in turn send requests to the db and nfs helper servers. The
request rate is throttled to produce a fixed amount of work. In order
to increase utilization on the host, more sets of these servers are
used. The clients that send the requests also have a response time
requirement, which is monitored. The following results all passed the
response time requirements.
What's the typical I/O load (disk and network bandwidth) while the
tests are running?
This is average throughput:
network: Tx: 79 MB/sec Rx: 5 MB/sec
disk: read: 17 MB/sec write: 40 MB/sec
The host hardware:
A 2-socket, 8-core Nehalem with SMT and EPT enabled, lots of disks, 4 x
1 Gb Ethernet
CPU time measurements with SMT can vary wildly if the system is not
fully loaded. If the scheduler happens to schedule two threads on a
single core, both of these threads will get less work done than if they
were scheduled on different cores.
Understood. Even if, at low loads, the scheduler does the right thing
and spreads out across all the cores first, once it goes beyond 50%
utilization the CPU util can climb at a much higher rate (compared to a
linear increase in work), because it then starts scheduling 2 threads
per core and each thread can do less work. I have always wanted
something that could more accurately show the utilization of a
processor core, but I guess we have to use what we have today. I will
run again with SMT off to see what we get.
Test Results:
The throughput is equal in these tests, as the clients throttle the work
(this is assuming you don't run out of a resource on the host). What's
telling is the CPU used to do the same amount of work:
Xen: 52.85%
KVM: 66.93%
So, KVM requires 66.93/52.85 = 26.6% more CPU to do the same amount of
work. Here's the breakdown:
total  user  nice  system  irq   softirq  guest
66.90  7.20  0.00  12.94   0.35  3.39     43.02
Comparing all other busy time (66.90 - 43.02 = 23.88) to guest time,
that's a 23.88/43.02 = 55% overhead for virtualization. I certainly
don't expect it to be 0, but
55% seems a bit high. So, what's the reason for this overhead? At the
bottom is oprofile output of top functions for KVM. Some observations:
1) I'm seeing about 2.3% in scheduler functions [that I recognize].
Does that seem a bit excessive?
Yes, it is. If there is a lot of I/O, this might be due to the thread
pool used for I/O.
I have an older patch which makes a small change to posix_aio_thread.c by
trying to keep the thread pool size a bit lower than it is today. I
will dust that off and see if it helps.
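
For reference, the idea is just to cap how many worker threads the pool
will ever spawn and let idle workers exit after a short timeout. A rough
sketch of that pattern is below; the names (MAX_AIO_THREADS, aio_submit,
etc.) are made up for illustration and this is not the actual
posix_aio_thread.c code:

/* Sketch of a bounded I/O thread pool: only spawn a worker when nobody
 * is idle and we are under a hard cap, and let idle workers exit after
 * a timeout.  Illustrative only; not the actual qemu code. */
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

#define MAX_AIO_THREADS 8   /* hypothetical cap; the point of the patch */
#define IDLE_TIMEOUT_S  5   /* idle workers exit after this many seconds */

struct aio_req { struct aio_req *next; void (*fn)(void *); void *arg; };

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static struct aio_req *queue;
static int nr_threads, idle_threads;

static void *aio_worker(void *unused)
{
    pthread_mutex_lock(&lock);
    for (;;) {
        while (!queue) {
            struct timespec ts;
            clock_gettime(CLOCK_REALTIME, &ts);
            ts.tv_sec += IDLE_TIMEOUT_S;
            idle_threads++;
            int timedout = pthread_cond_timedwait(&cond, &lock, &ts);
            idle_threads--;
            if (timedout && !queue) {   /* no work arrived: shrink the pool */
                nr_threads--;
                pthread_mutex_unlock(&lock);
                return NULL;
            }
        }
        struct aio_req *req = queue;
        queue = req->next;
        pthread_mutex_unlock(&lock);
        req->fn(req->arg);              /* the blocking I/O happens here */
        free(req);
        pthread_mutex_lock(&lock);
    }
}

/* Queue a request; grow the pool only if nobody is idle and we are under the cap. */
void aio_submit(void (*fn)(void *), void *arg)
{
    struct aio_req *req = malloc(sizeof(*req));
    req->fn = fn;
    req->arg = arg;
    pthread_mutex_lock(&lock);
    req->next = queue;
    queue = req;
    if (!idle_threads && nr_threads < MAX_AIO_THREADS) {
        pthread_t tid;
        if (pthread_create(&tid, NULL, aio_worker, NULL) == 0) {
            pthread_detach(tid);
            nr_threads++;
        }
    }
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);
}

The cap keeps the number of runnable I/O threads (and the context
switching between them) bounded, at the cost of queueing requests when
all workers are busy.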
2) cpu_physical_memory_rw due to not using preadv/pwritev?
I think both virtio-net and virtio-blk use memcpy().
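
Just to illustrate the difference being pointed at: a memcpy()-style
path reads into a bounce buffer and then copies into each guest region,
while preadv() lets the kernel scatter directly into the guest's
buffers and skips the extra copy. A rough sketch, assuming the guest
regions are already mapped into qemu's address space (the
read_to_guest_* names are made up for illustration, not virtio code):

#define _GNU_SOURCE         /* for preadv() on glibc */
#include <sys/uio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Bounce-buffer path: one extra memcpy() per guest region. */
ssize_t read_to_guest_bounce(int fd, off_t off, struct iovec *iov, int cnt)
{
    size_t total = 0;
    for (int i = 0; i < cnt; i++)
        total += iov[i].iov_len;

    char *bounce = malloc(total);
    ssize_t n = pread(fd, bounce, total, off);

    /* Copy out of the bounce buffer into each guest region. */
    size_t done = 0;
    for (int i = 0; n > 0 && i < cnt && done < (size_t)n; i++) {
        size_t chunk = iov[i].iov_len;
        if (chunk > (size_t)n - done)
            chunk = (size_t)n - done;
        memcpy(iov[i].iov_base, bounce + done, chunk);
        done += chunk;
    }
    free(bounce);
    return n;
}

/* preadv() path: the kernel scatters directly into the guest regions. */
ssize_t read_to_guest_preadv(int fd, off_t off, struct iovec *iov, int cnt)
{
    return preadv(fd, iov, cnt, off);
}

The same applies on the write side with pwritev() versus gathering into
a contiguous buffer first.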
3) vmx_[save|load]_host_state: I take it this is from guest switches?
These are called when you context-switch from a guest, and, much more
frequently, when you enter qemu.
We have 180,000 context switches a second. Is this more than expected?
Way more. Across 16 logical cpus, this is >10,000 cs/sec/cpu.
I wonder if schedstats can show why we context switch (need to let
someone else run, yielded, waiting on io, etc).
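
Short of a full tracer, a crude first cut might be the per-task
counters in /proc/<pid>/status: voluntary_ctxt_switches roughly means
the thread blocked (waiting on I/O, sleeping), while
nonvoluntary_ctxt_switches means it was preempted. Something like the
sketch below could dump those for the qemu threads (purely
illustrative, not an existing tool):

/* Print voluntary vs. nonvoluntary context-switch counts for a pid,
 * read from /proc/<pid>/status.  Voluntary ~ the task blocked (I/O,
 * sleeping); nonvoluntary ~ it was preempted.  Illustrative sketch. */
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }

    char path[64];
    snprintf(path, sizeof(path), "/proc/%s/status", argv[1]);

    FILE *f = fopen(path, "r");
    if (!f) {
        perror(path);
        return 1;
    }

    char line[256];
    while (fgets(line, sizeof(line), f)) {
        if (!strncmp(line, "voluntary_ctxt_switches", 23) ||
            !strncmp(line, "nonvoluntary_ctxt_switches", 26))
            fputs(line, stdout);
    }
    fclose(f);
    return 0;
}

Running that against each vcpu/I/O thread before and after a run would
at least say whether the switches are mostly voluntary (blocking) or
preemptions.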
Yes, there is a scheduler tracer, though I have no idea how to operate
it.
Do you have kvm_stat logs?
Sorry, I don't, but I'll run that next time. BTW, I did not notice a
batch/log mode the last time I ran kvm_stat, or maybe it was just not
obvious to me. Is there an ideal way to run kvm_stat without a
curses-like output?
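
In the meantime I could probably just poll the counters myself; as far
as I understand, kvm_stat reads them from debugfs under
/sys/kernel/debug/kvm/, so something like the sketch below (illustrative
only) could log a snapshot per second without any curses output:

/* Periodically dump the kvm counters exposed in debugfs
 * (/sys/kernel/debug/kvm/<counter>), which is what kvm_stat itself
 * reads.  Requires debugfs to be mounted.  Illustrative sketch only. */
#include <dirent.h>
#include <stdio.h>
#include <unistd.h>

#define KVM_DEBUGFS "/sys/kernel/debug/kvm"

static void dump_counters(void)
{
    DIR *d = opendir(KVM_DEBUGFS);
    if (!d) {
        perror(KVM_DEBUGFS);
        return;
    }
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (e->d_name[0] == '.')
            continue;
        char path[512];
        snprintf(path, sizeof(path), KVM_DEBUGFS "/%s", e->d_name);
        FILE *f = fopen(path, "r");
        if (!f)
            continue;
        unsigned long long val;
        if (fscanf(f, "%llu", &val) == 1)
            printf("%s %llu\n", e->d_name, val);
        fclose(f);
    }
    closedir(d);
}

int main(void)
{
    for (;;) {
        dump_counters();
        printf("----\n");
        fflush(stdout);
        sleep(1);           /* one snapshot per second */
    }
}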
-Andrew