Hi,

I'm looking for advice: we have KVM guests whose processes (and, we suspect, the entire guest) seem to be slow to wake up. We'd like to characterise this and track it down to a cause.

Historically we used VMware Server. This worked adequately on older hosts, but the latest batch of hosts we acquired had these problems of intermittently hanging for seconds or more. They seemed to be related to disk IO time-out errors in the guests' vmware.log, and these could push the guests' timekeeping beyond ntpd's ability to recover. We tried both SCSI and IDE virtual disk controllers, but both seemed to exhibit the problem.

We have tried but cannot find any evidence of hardware problems with the hosts we are using (they are 1&1 root-servers; see [2] below for details). We tried changing hosts and replacing the physical disks, which doesn't seem to have helped. That suggests a problem in the guests, but we are not sure what. We never quite got to the bottom of it; we concluded that VMware Server wasn't a good platform to be using anyway, and converted to KVM.

However, we are seeing similar symptoms in the KVM guests, in that an intermittently sleeping process can sometimes sleep for far longer than you'd think reasonable (see [1] for details).

I cannot yet claim to have tried everything I can think of; it's a slow process. I'm still scouring the internet for information, and we are also trying to find alternative hosting to compare with. However, it seems reasonable to ask:

 - What can cause a guest's process to sleep too long (with KVM on CentOS 5, host and guest), besides being in IO-wait?
 - What might cause a whole guest to hang?
 - How can these causes be distinguished?
 - What is the recommended way (for non kernel-hackers) to profile CentOS 5 systems, in order to pinpoint IO bottlenecks or other scheduling problems?

We don't currently have the option of upgrading to CentOS 6, and would prefer to use stock kernels in production. This may mean some of the currently-used tools can't be applied, but it's hard to know which.

Details follow.

Thanks,
Nick

--
[1] The symptoms in more detail

Occasional hangs whilst logged into a guest's console, sometimes minutes long. This may have improved slightly after a switch from IDE emulation to virtio_blk.

The main (and undoubtedly flawed) metric we have is from running a loop like this:

    while true; do date; sleep 1; done >dateloop

A histogram of the delays between successive dates in the dateloop file can then be made (a sketch of one way to do this follows this section). The guest's histogram contains a "long tail" of big delays, the peak of which can be tens of seconds or even several minutes in a few cases. These seem to correlate with our observed console hangs. For example (guest spec as below, with virtio_blk):

    starts: Mon Aug  1 09:01:46 2011 GMT
    ends:   Thu Aug 11 16:47:40 2011 GMT
    largest delay: 69 seconds

    <=   1 second : 859275
    <=   2 seconds:  13238
    <=   4 seconds:    538
    <=   8 seconds:    407
    <=  16 seconds:    118
    <=  32 seconds:     14
    <=  64 seconds:      8
    <= 128 seconds:      1

In comparison, with the same test over approximately the same period on the physical host running the guests, the longest delay we've seen is about 3 seconds (8 on some other, heavily loaded hosts):

    starts: Mon Aug  1 12:33:45 2011 GMT
    ends:   Thu Aug 11 16:53:06 2011 GMT
    largest delay: 3 seconds

    <= 1 second : 867643
    <= 2 seconds:   5950
    <= 4 seconds:      6

Neither host nor guest is used in production yet, so they are not under heavy load.
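For what it's worth, the bucketing can be done with something like the following. This is only a rough sketch (it assumes a GNU date that can parse the timestamp format used above), not necessarily exactly what we ran:

    #!/bin/bash
    # Turn the dateloop file into a histogram of inter-timestamp delays,
    # bucketed into power-of-two ranges (<= 1s, <= 2s, <= 4s, ...).
    prev=""
    while read line; do
        now=$(date -d "$line" +%s)         # needs GNU date to parse the format
        [ -n "$prev" ] && echo $(( now - prev ))
        prev=$now
    done < dateloop |
    awk '{ b = 1; while ($1 > b) b *= 2; count[b]++ }
         END { for (b in count) printf "<= %3d seconds: %d\n", b, count[b] }' |
    sort -n -k2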
According to "sar", the long delays mostly correlate with peaks in IO wait of between 15% and 40%, apparently triggered by the (uncustomised) nightly cron jobs which run at 4am.

--
[2] Specifications

The host is a 1&1 root-server with 8GB RAM and a 64-bit Phenom(tm) II X4 910e:
http://order.1and1.co.uk/xml/order/ServerPremiumQuadCoreXL

Software is:
 - x86_64 version of CentOS 5.6 with a stock 2.6.18-289.9.1 kernel
 - kvm-83-224 and libvirt-0.8.2-15 packages from the CentOS base repository

The current guest config:
 - CentOS 5.6 i386, kernel 2.6.18-194.26.1
 - qcow2 disk images with cache="none" (in the libvirt domain definition)
 - virtio disks with the cfq IO scheduler
 - the acpi_pm clocksource

Guest timekeeping seems reliable; ntpd is syncing happily; and the guests are mostly idle.

cfq and acpi_pm were not explicitly chosen; they are just what we seem to have ended up with. I'm a bit puzzled as to why cfq is the default IO scheduler for virtio, given that I've seen noop recommended.

Things I plan to try later (see the sketch at the end of this message):
 - storage in raw disk images
 - storage in LVM block devices
 - the noop IO scheduler
 - kvm-clock

--
[3] Disk IO benchmarks

The best of a number of Bonnie++ runs in the guest gets ~30MB/s block output and ~6MB/s block input (the runs vary quite a lot, and can be significantly worse than this). So the guest's block output seems to be about a quarter of what the host is capable of; its block input is more like a tenth. Presumably it should be possible to do much better than that?

The cases in the bonnie++ measurements below are as follows.

Host as above with:
 a) a fresh CentOS 5.6 install, no hypervisors
 b) VMware Server v1 and two guests
 c) KVM (kvm-83-224 and libvirt-0.8.2-15 from the CentOS base repo) and two guests

Guest as above with:
 d) IDE disks
 e) virtio disks

(Note: the date-loop histograms above were obtained from cases c) and e).)

                      ------Sequential Output------ --Sequential Input- --Random-
                      -Per Chr- --Block-- -Rewrite-  -Per Chr- --Block-- --Seeks--
                 Size K/sec %CP K/sec %CP K/sec %CP  K/sec %CP K/sec %CP  /sec %CP
 a) host (bare)   16G   161  99 105132 64  35219 40    349  99 123116 82 182.5  56
 b) host+VMware   16G   166  99 110981 67  35709 41    352  99 122644 81 181.5  60
 c) host+KVM      16G   161  99  81498 53  26595 30    322  99  85870 54  67.2  16
 d) guest+ide      2G   179  99  28083  8   4705  1   1465  92  11106  2 157.1   7
 e) guest+virtio   2G   182  99  30280 11   4755  2   1154  85   6449  1 273.3   9
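--
[4] Commands for the later experiments

For reference, this is roughly how I intend to check and change the guest's IO scheduler and clocksource for the experiments listed in [2], and how we pull the 4am IO-wait figures out of sar. It's only a sketch: it assumes the virtio disk shows up as /dev/vda, and I haven't yet confirmed that the i386 CentOS 5 guest kernel exposes all of these sysfs files.

    # Current and available IO schedulers for the virtio disk (the one in
    # square brackets is active); switching takes effect immediately:
    cat /sys/block/vda/queue/scheduler
    echo noop > /sys/block/vda/queue/scheduler

    # To make it the default at boot, append "elevator=noop" to the kernel
    # line in the guest's /boot/grub/grub.conf.

    # Clocksources the guest kernel knows about, and the one currently in use
    # (looking for kvm-clock in the available list):
    cat /sys/devices/system/clocksource/clocksource0/available_clocksource
    cat /sys/devices/system/clocksource/clocksource0/current_clocksource

    # The nightly IO-wait peaks, from the sysstat logs (saNN = day of month):
    sar -u -f /var/log/sa/saNN -s 03:30:00 -e 05:00:00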