Hi,

I'm looking for advice: we have KVM guests whose processes (and, we suspect, the entire guest) seem to be slow to wake up. We'd like to characterise this and track it down to a cause.

Historically we used VMware Server. This worked adequately on older hosts, but the latest batch of hosts we acquired had these problems of intermittently hanging for seconds or more. They seemed to be related to disk IO time-out errors in the guests' vmware.log, and these could push the guests' timekeeping beyond ntpd's ability to recover. We tried both SCSI and IDE virtual disk controllers, but both seemed to exhibit the problem.

We have tried but cannot find any evidence of hardware problems with the hosts we are using (they are 1&1 root-servers; see [2] below for details). We tried changing hosts and replacing the physical disks, which doesn't seem to have helped. That suggests a problem in the guests, but we are not sure what. We never quite got to the bottom of it; we concluded that VMware Server wasn't a good platform to be using anyway, and converted to KVM.

However, we are seeing similar symptoms in the KVM guests, in that an intermittently sleeping process can sometimes sleep for far longer than you'd think reasonable (see [1] for details).

I cannot yet claim to have tried everything I can think of; it's a slow process. I'm still scouring the internet for information, and we are also trying to find alternative hosting to compare with. However, it seems reasonable to ask:

 - What can cause a guest's process to sleep too long (with KVM on CentOS 5, host and guest), besides being in IO-wait?
 - What might cause a whole guest to hang?
 - How can these causes be distinguished?
 - What is the recommended way (for non kernel-hackers) to profile CentOS 5 systems, in order to pinpoint IO bottlenecks or other scheduling problems?

We don't currently have the option of upgrading to CentOS 6, and would prefer to use stock kernels in production. This may mean some of the currently-used tools can't be applied, but it's hard to know which.

Details follow.

Thanks,
Nick

--
[1] The symptoms in more detail

Occasional hangs whilst logged into a guest's console, sometimes minutes long. This may have improved slightly after a switch from IDE emulation to virtio_blk.

The main (and undoubtedly flawed) metric we have is from running a loop like this:

    while true; do date; sleep 1; done >dateloop

A histogram of the delays between successive dates in the dateloop file can then be made (a sketch of one way to do this follows this section). The guest's histogram contains a "long tail" of big delays, the peak of which can be tens of seconds or even several minutes in a few cases. These seem to correlate with our observed console hangs. For example (guest spec as below, with virtio_blk):

    starts: Mon Aug  1 09:01:46 2011 GMT
    ends:   Thu Aug 11 16:47:40 2011 GMT
    largest delay: 69 seconds

    <=   1 second : 859275
    <=   2 seconds:  13238
    <=   4 seconds:    538
    <=   8 seconds:    407
    <=  16 seconds:    118
    <=  32 seconds:     14
    <=  64 seconds:      8
    <= 128 seconds:      1

In comparison, with the same test over approximately the same period on the physical host running the guests, the longest delay we've seen is about 3 seconds (8 on some other, heavily loaded hosts):

    starts: Mon Aug  1 12:33:45 2011 GMT
    ends:   Thu Aug 11 16:53:06 2011 GMT
    largest delay: 3 seconds

    <= 1 second : 867643
    <= 2 seconds:   5950
    <= 4 seconds:      6

Neither host nor guest is used in production yet, so they are not under heavy load.
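For what it's worth, the bucketing can be done with something like the following. This is only a rough sketch (it assumes a GNU date that can parse the timestamp format used above), not necessarily exactly what we ran:

    #!/bin/bash
    # Turn the dateloop file into a histogram of inter-timestamp delays,
    # bucketed into power-of-two ranges (<= 1s, <= 2s, <= 4s, ...).
    prev=""
    while read line; do
        now=$(date -d "$line" +%s)         # needs GNU date to parse the format
        [ -n "$prev" ] && echo $(( now - prev ))
        prev=$now
    done < dateloop |
    awk '{ b = 1; while ($1 > b) b *= 2; count[b]++ }
         END { for (b in count) printf "<= %3d seconds: %d\n", b, count[b] }' |
    sort -n -k2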
According to "sar", the long delays mostly correlate with peaks in IO wait of between 15% and 40%, apparently triggered by the (uncustomised) nightly cron jobs which run at 4am.

--
[2] Specifications

The host is a 1&1 root-server with 8GB RAM and a 64-bit Phenom(tm) II X4 910e:
http://order.1and1.co.uk/xml/order/ServerPremiumQuadCoreXL

Software is:
 - x86_64 version of CentOS 5.6 with a stock 2.6.18-289.9.1 kernel
 - kvm-83-224 and libvirt-0.8.2-15 packages from the CentOS base repository

The current guest config:
 - CentOS 5.6 i386, kernel 2.6.18-194.26.1
 - qcow2 disk images with cache="none" (in the libvirt domain definition)
 - virtio disks with the cfq IO scheduler
 - the acpi_pm clocksource

Guest timekeeping seems reliable; ntpd is syncing happily; and the guests are mostly idle.

cfq and acpi_pm were not explicitly chosen; they are just what we seem to have ended up with. I'm a bit puzzled as to why cfq is the default IO scheduler for virtio, given that I've seen noop recommended.

Things I plan to try later (see the sketch at the end of this message):
 - storage in raw disk images
 - storage in LVM block devices
 - the noop IO scheduler
 - kvm-clock

--
[3] Disk IO benchmarks

The best of a number of Bonnie++ runs in the guest gets ~30MB/s block output and ~6MB/s block input (the runs vary quite a lot, and can be significantly worse than this). So the guest's block output seems to be about a quarter of what the host is capable of; its block input is more like a tenth. Presumably it should be possible to do much better than that?

The cases in the bonnie++ measurements below are as follows.

Host as above with:
 a) a fresh CentOS 5.6 install, no hypervisors
 b) VMware Server v1 and two guests
 c) KVM (kvm-83-224 and libvirt-0.8.2-15 from the CentOS base repo) and two guests

Guest as above with:
 d) IDE disks
 e) virtio disks

(Note: the date-loop histograms above were obtained from cases c) and e).)

                      ------Sequential Output------ --Sequential Input- --Random-
                      -Per Chr- --Block-- -Rewrite-  -Per Chr- --Block-- --Seeks--
                 Size K/sec %CP K/sec %CP K/sec %CP  K/sec %CP K/sec %CP  /sec %CP
 a) host (bare)   16G   161  99 105132 64  35219 40    349  99 123116 82 182.5  56
 b) host+VMware   16G   166  99 110981 67  35709 41    352  99 122644 81 181.5  60
 c) host+KVM      16G   161  99  81498 53  26595 30    322  99  85870 54  67.2  16
 d) guest+ide      2G   179  99  28083  8   4705  1   1465  92  11106  2 157.1   7
 e) guest+virtio   2G   182  99  30280 11   4755  2   1154  85   6449  1 273.3   9
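--
[4] Commands for the later experiments

For reference, this is roughly how I intend to check and change the guest's IO scheduler and clocksource for the experiments listed in [2], and how we pull the 4am IO-wait figures out of sar. It's only a sketch: it assumes the virtio disk shows up as /dev/vda, and I haven't yet confirmed that the i386 CentOS 5 guest kernel exposes all of these sysfs files.

    # Current and available IO schedulers for the virtio disk (the one in
    # square brackets is active); switching takes effect immediately:
    cat /sys/block/vda/queue/scheduler
    echo noop > /sys/block/vda/queue/scheduler

    # To make it the default at boot, append "elevator=noop" to the kernel
    # line in the guest's /boot/grub/grub.conf.

    # Clocksources the guest kernel knows about, and the one currently in use
    # (looking for kvm-clock in the available list):
    cat /sys/devices/system/clocksource/clocksource0/available_clocksource
    cat /sys/devices/system/clocksource/clocksource0/current_clocksource

    # The nightly IO-wait peaks, from the sysstat logs (saNN = day of month):
    sar -u -f /var/log/sa/saNN -s 03:30:00 -e 05:00:00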