Thank you for your response, Stefan.

On 02/17/2012 04:30 AM, Stefan Hajnoczi wrote:
> On Fri, Feb 17, 2012 at 4:57 AM, Pete Ashdown <pashdown@xxxxxxxxxxxx> wrote:
>> I've been waiting for some response from the Ubuntu team regarding a bug on
>> launchpad, but it appears that it isn't being taken seriously:
>>
>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/745785
> This looks interesting. Let me try to summarize, please point out if
> I get something wrong:
>
> You have software RAID1 on the host, your disk images live on this
> device. Whenever checkarray runs on the host you find that VMs become
> unresponsive. Guests print warnings that a task is blocked for more
> than 120 seconds. Guests become unresponsive on the network.

In my case it is drbd+RAID10, but the bug still applies. It doesn't happen
every time checkarray runs, only when checkarray decides to do a resync. All
IO then blocks at some point before the resync finishes, and it isn't long
before the guests start to fail because they cannot read or write.

> The fact that the QEMU monitor and VNC still work mean that QEMU is
> not probably still running the VM. I think the guest kernel is upset,
> perhaps QEMU needs to do something to help these I/Os along.

Note that *ALL* IO is blocked, even on the host kernel. The host has trouble
rebooting at that point too, so I have to power cycle it.

> Please post your qemu-kvm command-line or libvirt domain XML.
>

<domain type='kvm'>
  <name>guestname</name>
  <uuid>c4cb4999-0713-dffa-32f8-1bb7278b3f5c</uuid>
  <memory>8388608</memory>
  <currentMemory>524288</currentMemory>
  <vcpu>1</vcpu>
  <os>
    <type arch='x86_64' machine='pc-0.12'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/bin/kvm</emulator>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw'/>
      <source dev='/dev/vm/guestname'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <interface type='bridge'>
      <mac address='52:54:00:ca:7b:70'/>
      <source bridge='br30'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes'/>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </memballoon>
  </devices>
</domain>

> What's the easiest way to reproduce this?

Based on the Launchpad bug, Ubuntu 11.04 with default packages + RAID1/RAID10
I think would have the issue eventually. I don't think it has to be a
particularly IO intensive system.
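
For anyone trying to line up the hang with array activity, here is a minimal sketch (an illustration only, not something taken from the bug report) that reports which md arrays have a resync or check in progress. It assumes the usual /proc/mdstat progress line of the form "resync = 12.6% (...)". The function name, the regular expressions, and the SAMPLE text are all made up for the example.

```python
import re

# Progress line md prints while a sync action runs, e.g.
#   [==>.....]  resync = 12.6% (123/456) finish=127.5min speed=33440K/sec
PROGRESS = re.compile(r"\b(resync|check|recovery|reshape)\s*=\s*([\d.]+)%")
# Array header line, e.g. "md0 : active raid10 sdd1[3] ..."
ARRAY = re.compile(r"^(md\d+)\s*:")


def sync_status(mdstat_text):
    """Return {array: (action, percent)} for arrays with a sync action running."""
    status = {}
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY.match(line)
        if header:
            current = header.group(1)
            continue
        progress = PROGRESS.search(line)
        if progress and current:
            status[current] = (progress.group(1), float(progress.group(2)))
    return status


# Hypothetical mdstat contents, only to exercise the parser.
SAMPLE = """\
Personalities : [raid1] [raid10]
md0 : active raid10 sdd1[3] sdc1[2] sdb1[1] sda1[0]
      1953519616 blocks 64K chunks 2 near-copies [4/4] [UUUU]
      [==>..................]  resync = 12.6% (246143424/1953519616) finish=127.5min speed=222440K/sec

md1 : active raid1 sdf1[1] sde1[0]
      976630336 blocks [2/2] [UU]

unused devices: <none>
"""

if __name__ == "__main__":
    print(sync_status(SAMPLE))
```

On a live host the same function could be fed open("/proc/mdstat").read() from a cron job or a loop, with timestamps logged, so the point where IO stalls can be compared against the resync progress.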