Hello,

This bug/patch is only relevant for qemu-0.14.x users, since in
qemu-0.15 the main loop is completely different. So this is more or
less for reference only, in case others experience the same deadlock.

I have created a snapshot of a VM, which now doesn't load.

Thread 1 loads the saved state and calls on_vcpu(), which notifies
Thread 2 before going to sleep.
Thread 2 waits for "qemu_system_ready" being set, so it isn't yet
waiting for qemu_work_cond.
qemu_system_ready would only later be set by Thread 1 in
kvm_main_loop().

Here's the gdb backtrace:

(gdb) thread apply all bt

Thread 2 (Thread 0x7fe9bcd89700 (LWP 16607)):
#0  0x00007fe9c5ad716c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1  0x000000000043c60a in qemu_cond_wait (_env=0x13e0ec0) at qemu-kvm-0.14.1+dfsg/qemu-kvm.c:1107
#2  ap_main_loop (_env=0x13e0ec0) at qemu-kvm-0.14.1+dfsg/qemu-kvm.c:1460
#3  0x00007fe9c5ad28ba in start_thread () from /lib/libpthread.so.0
#4  0x00007fe9c2cb002d in clone () from /lib/libc.so.6
#5  0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7fe9c62ea760 (LWP 16601)):
#0  0x00007fe9c5ad716c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1  0x0000000000438b1f in qemu_cond_wait (env=<value optimized out>, func=<value optimized out>, data=<value optimized out>) at qemu-kvm-0.14.1+dfsg/qemu-kvm.c:1107
#2  on_vcpu (env=<value optimized out>, func=<value optimized out>, data=<value optimized out>) at qemu-kvm-0.14.1+dfsg/qemu-kvm.c:1161
#3  0x000000000057a275 in cpu_synchronize_state (env=0x13e0ec0) at qemu-kvm-0.14.1+dfsg/kvm.h:183
#4  get_pcr_cpu (env=0x13e0ec0) at qemu-kvm-0.14.1+dfsg/kvm-tpr-opt.c:213
#5  kvm_tpr_enable_vapic (env=0x13e0ec0) at qemu-kvm-0.14.1+dfsg/kvm-tpr-opt.c:224
#6  0x00000000004370ae in kvm_arch_load_regs (env=0x13e0ec0, level=3) at ../qemu-kvm-x86.c:605
#7  0x0000000000438fe2 in kvm_cpu_synchronize_post_init (env=0x96de44) at qemu-kvm-0.14.1+dfsg/qemu-kvm.c:1190
#8  0x000000000040ce58 in cpu_synchronize_post_init () at qemu-kvm-0.14.1+dfsg/kvm.h:197
#9  cpu_synchronize_all_post_init () at qemu-kvm-0.14.1+dfsg/cpus.c:98
#10 0x00000000004a5087 in qemu_loadvm_state (f=0x167fa10) at savevm.c:1826
#11 0x00000000004a5281 in load_vmstate (name=<value optimized out>) at savevm.c:2071
#12 0x000000000041bcbc in main (argc=52, argv=<value optimized out>, envp=<value optimized out>) at qemu-kvm-0.14.1+dfsg/vl.c:3187

The VM was started by libvirtd as

/usr/bin/kvm -S \
 -M pc-0.14 \
 -enable-kvm \
 -m 512 \
 -smp 1,sockets=1,cores=1,threads=1 \
 -name winxp-1 -uuid 35dcee44-64bb-4fe0-b75c-afc50fe09c0c \
 -nodefconfig -nodefaults \
 -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/winxp-1.monitor,server,nowait \
 -mon chardev=monitor,mode=readline \
 -rtc base=localtime \
 -boot dc \
 -drive file=/var/lib/libvirt/images/winxp-1-0.qcow2,if=none,id=drive-virtio-disk0,boot=on,format=qcow2 \
 -device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0 \
 -drive file=/var/lib/libvirt/images/win_xp_pro.iso,if=none,media=cdrom,id=drive-ide0-0-1,readonly=on,format=raw \
 -device ide-drive,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 \
 -drive file=/var/lib/libvirt/images/kvm-windows-drivers_\(virtio-1.1.16\).vfd,if=none,id=drive-fdc0-0-0,format=raw \
 -global isa-fdc.driveA=drive-fdc0-0-0 \
 -netdev tap,fd=46,id=hostnet0 \
 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:8f:53:6f,bus=pci.0,addr=0x3 \
 -usb \
 -device usb-tablet,id=input0 \
 -vnc 0.0.0.0:0 -k de -vga cirrus \
 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 \
 -loadvm "Start der Installation XP1"

-- 
Philipp Hahn           Open Source Software Engineer      hahn@xxxxxxxxxxxxx
Univention GmbH        Linux for Your Business        fon: +49 421 22 232- 0
Mary-Somerville-Str.1  D-28359 Bremen                 fax: +49 421 22 232-99
                                                   http://www.univention.de/
Bug #22877: Fix on_vcpu() deadlock on loadvm

Thread 1 loads the saved state and calls on_vcpu(), which notifies
Thread 2 before going to sleep.
Thread 2 waits for "qemu_system_ready" being set, so it isn't yet
waiting for qemu_work_cond.
qemu_system_ready would only later be set by Thread 1 in
kvm_main_loop().

(gdb) thread apply all bt

Thread 2 (Thread 0x7fe9bcd89700 (LWP 16607)):
#0  0x00007fe9c5ad716c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1  0x000000000043c60a in qemu_cond_wait (_env=0x13e0ec0) at qemu-kvm-0.14.1+dfsg/qemu-kvm.c:1107
#2  ap_main_loop (_env=0x13e0ec0) at qemu-kvm-0.14.1+dfsg/qemu-kvm.c:1460
#3  0x00007fe9c5ad28ba in start_thread () from /lib/libpthread.so.0
#4  0x00007fe9c2cb002d in clone () from /lib/libc.so.6
#5  0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7fe9c62ea760 (LWP 16601)):
#0  0x00007fe9c5ad716c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1  0x0000000000438b1f in qemu_cond_wait (env=<value optimized out>, func=<value optimized out>, data=<value optimized out>) at qemu-kvm-0.14.1+dfsg/qemu-kvm.c:1107
#2  on_vcpu (env=<value optimized out>, func=<value optimized out>, data=<value optimized out>) at qemu-kvm-0.14.1+dfsg/qemu-kvm.c:1161
#3  0x000000000057a275 in cpu_synchronize_state (env=0x13e0ec0) at qemu-kvm-0.14.1+dfsg/kvm.h:183
#4  get_pcr_cpu (env=0x13e0ec0) at qemu-kvm-0.14.1+dfsg/kvm-tpr-opt.c:213
#5  kvm_tpr_enable_vapic (env=0x13e0ec0) at qemu-kvm-0.14.1+dfsg/kvm-tpr-opt.c:224
#6  0x00000000004370ae in kvm_arch_load_regs (env=0x13e0ec0, level=3) at ../qemu-kvm-x86.c:605
#7  0x0000000000438fe2 in kvm_cpu_synchronize_post_init (env=0x96de44) at qemu-kvm-0.14.1+dfsg/qemu-kvm.c:1190
#8  0x000000000040ce58 in cpu_synchronize_post_init () at qemu-kvm-0.14.1+dfsg/kvm.h:197
#9  cpu_synchronize_all_post_init () at qemu-kvm-0.14.1+dfsg/cpus.c:98
#10 0x00000000004a5087 in qemu_loadvm_state (f=0x167fa10) at savevm.c:1826
#11 0x00000000004a5281 in load_vmstate (name=<value optimized out>) at savevm.c:2071
#12 0x000000000041bcbc in main (argc=52, argv=<value optimized out>, envp=<value optimized out>) at qemu-kvm-0.14.1+dfsg/vl.c:3187

Skip on_vcpu() while the system is not yet fully initialized.

This patch is only relevant for qemu-0.14.x, because the main event
loop is completely rewritten in qemu-0.15 and the implementation using
qemu_system_* is replaced by commit
fa7d1867578b6a1afc39d4ece8629a1e92baddd7.

--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -1174,7 +1174,7 @@ static void do_kvm_cpu_synchronize_state(void *_env)
 
 void kvm_cpu_synchronize_state(CPUState *env)
 {
-    if (!env->kvm_vcpu_dirty) {
+    if (qemu_system_ready && !env->kvm_vcpu_dirty) {
         on_vcpu(env, do_kvm_cpu_synchronize_state, env);
     }
 }
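
For readers without a qemu-0.14 tree at hand, the following is a minimal standalone sketch of the pattern the patch relies on. It is not the qemu-kvm code: the names vcpu_thread(), run_on_vcpu(), synchronize_state(), system_ready and demo_guarded_sync() are hypothetical stand-ins for ap_main_loop(), on_vcpu(), kvm_cpu_synchronize_state() and qemu_system_ready, and the real work queue is reduced to a single slot. The worker does not service requests until the ready flag is set, so an unguarded run_on_vcpu() before that point would block forever; the guard simply skips the dispatch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical stand-ins, not the qemu-kvm symbols. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready_cond = PTHREAD_COND_INITIALIZER; /* system_ready changed */
static pthread_cond_t work_cond = PTHREAD_COND_INITIALIZER;  /* work queued or finished */

static bool system_ready;            /* plays the role of qemu_system_ready */
static bool stop;                    /* asks the worker to exit */
static void (*pending_func)(void *); /* single-slot "work queue" */
static void *pending_data;

/* Worker: first waits for system_ready, only then services work requests. */
static void *vcpu_thread(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!system_ready)            /* phase 1: requests are NOT serviced here */
        pthread_cond_wait(&ready_cond, &lock);
    while (!stop) {                  /* phase 2: normal work loop */
        if (pending_func) {
            pending_func(pending_data);
            pending_func = NULL;
            pthread_cond_broadcast(&work_cond); /* wake the waiting requester */
        } else {
            pthread_cond_wait(&work_cond, &lock);
        }
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Queue func for the worker and sleep until it has run.
 * Blocks forever if the worker is still stuck in phase 1. */
static void run_on_vcpu(void (*func)(void *), void *data)
{
    pthread_mutex_lock(&lock);
    pending_func = func;
    pending_data = data;
    pthread_cond_broadcast(&work_cond);
    while (pending_func)
        pthread_cond_wait(&work_cond, &lock);
    pthread_mutex_unlock(&lock);
}

static void mark_dirty(void *data)
{
    *(int *)data = 1;
}

/* Guarded variant: dispatch only once the system is ready and the state
 * is not already marked dirty. Returns true if work was dispatched. */
static bool synchronize_state(int *dirty)
{
    pthread_mutex_lock(&lock);
    bool ready = system_ready;
    pthread_mutex_unlock(&lock);

    if (ready && !*dirty) {
        run_on_vcpu(mark_dirty, dirty);
        return true;
    }
    return false;
}

/* Exercises the guard; returns a bitmask with one bit per passed check. */
int demo_guarded_sync(void)
{
    pthread_t tid;
    int dirty = 0;
    int result = 0;

    int rc = pthread_create(&tid, NULL, vcpu_thread, NULL);
    assert(rc == 0);

    /* Before the system is ready the call returns at once, no dispatch. */
    if (!synchronize_state(&dirty) && dirty == 0)
        result |= 1;

    pthread_mutex_lock(&lock);
    system_ready = true;
    pthread_cond_broadcast(&ready_cond);
    pthread_mutex_unlock(&lock);

    /* Once ready, the request runs on the worker thread. */
    if (synchronize_state(&dirty) && dirty == 1)
        result |= 2;

    /* Already dirty: nothing to do, so nothing is dispatched. */
    if (!synchronize_state(&dirty))
        result |= 4;

    pthread_mutex_lock(&lock);
    stop = true;
    pthread_cond_broadcast(&work_cond);
    pthread_mutex_unlock(&lock);
    pthread_join(tid, NULL);
    return result;
}
```

Calling demo_guarded_sync() from a main() and building with -pthread should be enough to try it. Dropping the "ready" test from synchronize_state() makes the first call hang in run_on_vcpu(), which is the situation shown in the backtrace. Note that the sketch skips the synchronization entirely before the system is ready, as the patch does; whether that is acceptable depends on no caller needing the synchronized state that early.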