Re: troubleshoot live migration

Found a workaround.

I don't know why I disabled hpet when the issue was with lapic; chalk it
up to late nights. I can't reproduce the issue if I disable kvmclock on
the qemu cmdline, or if I leave the vm at the grub prompt rather than
loading the guest OS. It kind of sucks to disable kvmclock, but at least
migration works again now. I have no idea whether it's the guest KVM
drivers, something in qemu, or the host (probably an incompatibility in
the combination), but it happens with both CentOS 6.5 guests and fully
patched Ubuntu 12.04 guests.
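
For anyone wanting to try the same workaround: the exact option isn't
given above. One common way to disable kvmclock is to clear the CPU
feature flag, e.g. adding "-cpu qemu64,-kvmclock" to the qemu cmdline
(the qemu64 model is an assumption here, since the cmdline quoted below
has no -cpu option). A minimal sketch for checking from inside a Linux
guest whether kvm-clock is still offered and in use follows; the
function names and the CLOCKSOURCE_DIR override are made up for
illustration, only the sysfs files are standard.

```shell
#!/bin/sh
# Sketch: check from inside a Linux guest whether kvm-clock is in use.
# CLOCKSOURCE_DIR is a hypothetical override so the functions can be tried
# against a scratch directory; on a real guest leave it at the default.
CLOCKSOURCE_DIR="${CLOCKSOURCE_DIR:-/sys/devices/system/clocksource/clocksource0}"

# Print the clocksource the guest kernel is currently using.
current_clocksource() {
    cat "$CLOCKSOURCE_DIR/current_clocksource"
}

# Exit 0 if kvm-clock is among the clocksources offered to the guest.
kvmclock_available() {
    grep -qw 'kvm-clock' "$CLOCKSOURCE_DIR/available_clocksource"
}
```

With kvmclock disabled on the host side, kvmclock_available should
fail in the guest and current_clocksource should report some other
source (tsc, hpet, acpi_pm or similar, depending on the guest).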

Here's our test environment in case anyone cares to try to reproduce.

Hosts: 2 x Intel E5-2650 KVM servers running Linux 3.10.26 and qemu
1.6.x or 1.7.0.

1. Start a virtual machine with CentOS 6.5 (or perhaps any common,
   current distro) without disabling kvmclock. I tried single, dual, and
   quad vcpus.
2. Let the vm run for a few hours (migrations immediately after start,
   or even 30 minutes later, seem to have no problem).
3. Migrate, and the vm should become unusable. You may be lucky enough
   to get console/serial output complaining about lapic, or it may just
   consume cpu.
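
The steps above can be sketched as a small script. This assumes the
migration is driven through libvirt's virsh, which the message does not
actually say; the function name, the VIRSH override, and the host name
in the usage line are all hypothetical.

```shell
#!/bin/sh
# Sketch of the reproduction: let the guest run for a while, then
# live-migrate it. Assumption (not from the original message): virsh is
# used to trigger the migration. VIRSH can be overridden for a dry run.
VIRSH="${VIRSH:-virsh}"

# migrate_after_wait DOMAIN DEST_URI WAIT_SECONDS
# Sleeps for WAIT_SECONDS, then asks libvirt to live-migrate DOMAIN.
migrate_after_wait() {
    domain="$1"
    dest_uri="$2"
    wait_seconds="$3"
    sleep "$wait_seconds"
    "$VIRSH" migrate --live "$domain" "$dest_uri"
}
```

For example, "migrate_after_wait VM12 qemu+ssh://host2/system 10800"
would wait three hours before migrating to a placeholder host named
host2. Afterwards, watch the guest's console/serial output for lapic
messages and check whether the qemu process is spinning on the
destination.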

On Wed, Jan 15, 2014 at 7:23 AM, Marcus Sorensen <shadowsor@xxxxxxxxx> wrote:
> Just an update, I found that with different tools I was able to see a
> repeating 'lapic increasing min_delta_ns' scrolling furiously. I've
> added -no-hpet to the cmdline, but was still able to replicate it.
>
> On Tue, Jan 14, 2014 at 1:36 PM, Marcus Sorensen <shadowsor@xxxxxxxxx> wrote:
>> Does anyone have tips on troubleshooting live migration? I'm not sure
>> if this should be a qemu question or a kvm one. I've got several
>> E5-2650 servers running in test environment, kernel 3.10.26 and qemu
>> 1.7.0. If I start a VM guest (say ubuntu, debian, or centos), I can
>> migrate it around from host to host to host just fine, but if I wait
>> awhile (say 1 hour), I try to migrate and it succeeds but the guest is
>> hosed. No longer pings, cpu is thrashing. I've tried to strace it and
>> don't see anything that other working hosts aren't doing, and I've
>> tried gdb but I'm not entirely sure what I'm doing. I tried
>> downgrading to qemu 1.6.1. I've found dozens of reports of such
>> behavior, but they're all due to other things (migrating between
>> different host CPUs, someone thinking it's virtio or memballoon only
>> to later find a fix like changing machine type, etc). I'm at a loss.
>> I've tried replacing the virtio disk/network with sata/e1000, no
>> difference. I'm just trying to figure out where to go from here.
>>
>> I've got a core dump of the qemu process that was spinning, and I can
>> reproduce it fairly easily. Here's an example qemu cmdline:
>>
>> /usr/bin/qemu-system-x86_64 -machine accel=kvm -name VM12 -S -machine
>> pc-i440fx-1.7,accel=kvm,usb=off -m 512 -realtime mlock=off -smp
>> 1,sockets=1,cores=1,threads=1 -uuid
>> dd10a210-ab41-4cc6-a8f2-51113dd39515 -no-user-config -nodefaults
>> -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/VM12.monitor,server,nowait
>> -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
>> -no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
>> -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive
>> file=/dev/sdd,if=none,id=drive-virtio-disk0,format=raw,cache=none
>> -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2
>> -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw,cache=none
>> -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=1
>> -netdev tap,fd=29,id=hostnet0,vhost=on,vhostfd=31 -device
>> virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:39:28:00:01,bus=pci.0,addr=0x3
>> -chardev pty,id=charserial0 -device
>> isa-serial,chardev=charserial0,id=serial0 -chardev
>> socket,id=charchannel0,path=/var/lib/libvirt/qemu/VM12.agent,server,nowait
>> -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=VM12.vport
>> -device usb-tablet,id=input0 -vnc 0.0.0.0:1 -device
>> cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device
>> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6