On 2011-02-26 12:43, xming wrote:
> When trying to start X (and it loads the qxl driver) the kvm process
> just crashes.
>
> qemu-kvm 0.14
>
> startup line:
>
> /usr/bin/kvm -name spaceball,process=spaceball -m 1024 -kernel
> /boot/bzImage-2.6.37.2-guest -append "root=/dev/vda ro" -smp 1 -netdev
> type=tap,id=spaceball0,script=kvm-ifup-brloc,vhost=on -device
> virtio-net-pci,netdev=spaceball0,mac=00:16:3e:00:08:01 -drive
> file=/dev/volume01/G-spaceball,if=virtio -vga qxl -spice
> port=5957,disable-ticketing -monitor
> telnet:192.168.0.254:10007,server,nowait,nodelay -pidfile
> /var/run/kvm/spaceball.pid
>
> The host is running vanilla 2.6.37.1 on amd64.
>
> Here is the bt:
>
> # gdb /usr/bin/qemu-system-x86_64
> GNU gdb (Gentoo 7.2 p1) 7.2
> Copyright (C) 2010 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-pc-linux-gnu".
> For bug reporting instructions, please see:
> <http://bugs.gentoo.org/>...
> Reading symbols from /usr/bin/qemu-system-x86_64...done.
> (gdb) set args -name spaceball,process=spaceball -m 1024 -kernel
> /boot/bzImage-2.6.37.2-guest -append "root=/dev/vda ro" -smp 1 -netdev
> type=tap,id=spaceball0,script=kvm-ifup-brloc,vhost=on -device
> virtio-net-pci,netdev=spaceball0,mac=00:16:3e:00:08:01 -drive
> file=/dev/volume01/G-spaceball,if=virtio -vga qxl -spice
> port=5957,disable-ticketing -monitor
> telnet:192.168.0.254:10007,server,nowait,nodelay -pidfile
> /var/run/kvm/spaceball.pid
> (gdb) run
> Starting program: /usr/bin/qemu-system-x86_64 -name
> spaceball,process=spaceball -m 1024 -kernel
> /boot/bzImage-2.6.37.2-guest -append "root=/dev/vda ro" -smp 1 -netdev
> type=tap,id=spaceball0,script=kvm-ifup-brloc,vhost=on -device
> virtio-net-pci,netdev=spaceball0,mac=00:16:3e:00:08:01 -drive
> file=/dev/volume01/G-spaceball,if=virtio -vga qxl -spice
> port=5957,disable-ticketing -monitor
> telnet:192.168.0.254:10007,server,nowait,nodelay -pidfile
> /var/run/kvm/spaceball.pid
> [Thread debugging using libthread_db enabled]
> do_spice_init: starting 0.6.0
> spice_server_add_interface: SPICE_INTERFACE_KEYBOARD
> spice_server_add_interface: SPICE_INTERFACE_MOUSE
> [New Thread 0x7ffff4802710 (LWP 30294)]
> spice_server_add_interface: SPICE_INTERFACE_QXL
> [New Thread 0x7fffaacae710 (LWP 30295)]
> red_worker_main: begin
> handle_dev_destroy_surfaces:
> handle_dev_destroy_surfaces:
> handle_dev_input: start
> [New Thread 0x7fffaa4ad710 (LWP 30298)]
> [New Thread 0x7fffa9cac710 (LWP 30299)]
> [New Thread 0x7fffa94ab710 (LWP 30300)]
> [New Thread 0x7fffa8caa710 (LWP 30301)]
> [New Thread 0x7fffa3fff710 (LWP 30302)]
> [New Thread 0x7fffa37fe710 (LWP 30303)]
> [New Thread 0x7fffa2ffd710 (LWP 30304)]
> [New Thread 0x7fffa27fc710 (LWP 30305)]
> [New Thread 0x7fffa1ffb710 (LWP 30306)]
> [New Thread 0x7fffa17fa710 (LWP 30307)]
> reds_handle_main_link:
> reds_show_new_channel: channel 1:0, connected successfully, over Non Secure link
> reds_main_handle_message: net test: latency 5.636000 ms, bitrate
> 11027768 bps (10.516899 Mbps)
> reds_show_new_channel: channel 2:0, connected successfully, over Non Secure link
> red_dispatcher_set_peer:
> handle_dev_input: connect
> handle_new_display_channel: jpeg disabled
> handle_new_display_channel: zlib-over-glz disabled
> reds_show_new_channel: channel 4:0, connected successfully, over Non Secure link
> red_dispatcher_set_cursor_peer:
> handle_dev_input: cursor connect
> reds_show_new_channel: channel 3:0, connected successfully, over Non Secure link
> inputs_link:
> [New Thread 0x7fffa07f8710 (LWP 30312)]
> [New Thread 0x7fff9fff7710 (LWP 30313)]
> [New Thread 0x7fff9f7f6710 (LWP 30314)]
> [New Thread 0x7fff9eff5710 (LWP 30315)]
> [New Thread 0x7fff9e7f4710 (LWP 30316)]
> [New Thread 0x7fff9dff3710 (LWP 30317)]
> [New Thread 0x7fff9d7f2710 (LWP 30318)]
> qemu-system-x86_64:
> /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1724:
> kvm_mutex_unlock: Assertion `!cpu_single_env' failed.
>
> Program received signal SIGABRT, Aborted.
> [Switching to Thread 0x7ffff4802710 (LWP 30294)]
> 0x00007ffff5daa165 in raise () from /lib/libc.so.6
> (gdb) bt
> #0  0x00007ffff5daa165 in raise () from /lib/libc.so.6
> #1  0x00007ffff5dab580 in abort () from /lib/libc.so.6
> #2  0x00007ffff5da3201 in __assert_fail () from /lib/libc.so.6
> #3  0x0000000000436f7e in kvm_mutex_unlock ()
>     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1724
> #4  qemu_mutex_unlock_iothread ()
>     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1737
> #5  0x00000000005e84ee in qxl_hard_reset (d=0x15d3080, loadvm=0)
>     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/hw/qxl.c:665
> #6  0x00000000005e9f9a in ioport_write (opaque=0x15d3080, addr=<value optimized out>, val=0)
>     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/hw/qxl.c:979
> #7  0x0000000000439d4e in kvm_handle_io (env=0x11a3e00)
>     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/kvm-all.c:818
> #8  kvm_run (env=0x11a3e00)
>     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:617
> #9  0x0000000000439f79 in kvm_cpu_exec (env=0x764b)
>     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1233
> #10 0x000000000043b2d7 in kvm_main_loop_cpu (_env=0x11a3e00)
>     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1419
> #11 ap_main_loop (_env=0x11a3e00)
>     at /var/tmp/portage/app-emulation/qemu-kvm-0.14.0/work/qemu-kvm-0.14.0/qemu-kvm.c:1466
> #12 0x00007ffff77bb944 in start_thread () from /lib/libpthread.so.0
> #13 0x00007ffff5e491dd in clone () from /lib/libc.so.6
> (gdb)

That's a spice bug. In fact, there are a lot of qemu_mutex_lock/unlock_iothread calls in that subsystem, and I bet at least a few of them can cause even subtler problems. There are two general issues with dropping the global mutex like this:

- The caller of mutex_unlock is responsible for maintaining cpu_single_env across the unlocked phase (that is what triggers the abort above).
- Dropping the lock in the middle of a callback is risky. It may enable re-entrancy into code sections that were not designed for it (I'm skeptical about the side effects of qemu_spice_vm_change_state_handler - why drop the lock there?).

Spice needs a careful review regarding such issues. Alternatively, it could pioneer the introduction of its own lock, so that at least the related I/O activity of the VCPUs can be handled without holding the global mutex (though I bet spice is not the simplest candidate for such a new scheme).

Jan
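
For context on the first point: a VCPU thread reaches the qxl callback through kvm_run() -> kvm_handle_io() with the global mutex held and cpu_single_env pointing at the current CPU, so the unlock in qxl_hard_reset() trips the check. Roughly, as a paraphrased sketch of the qemu-kvm 0.14 helper (not a verbatim copy; the mutex name is simplified):

    /* Paraphrased from qemu-kvm.c near the line cited in the
     * backtrace; not a verbatim copy of the 0.14 source. */
    void kvm_mutex_unlock(void)
    {
        /* A VCPU thread must detach itself from cpu_single_env before
         * dropping the global mutex.  qxl_hard_reset() unlocks from an
         * ioport callback while cpu_single_env is still set, so this
         * assertion aborts the process. */
        assert(!cpu_single_env);
        qemu_mutex_unlock(&qemu_global_mutex);  /* mutex name simplified */
    }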
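
One way a caller could meet that responsibility is to park cpu_single_env before unlocking and restore it after relocking. This is only a minimal sketch under the qemu-kvm 0.14 internals (CPUState, cpu_single_env, the iothread lock helpers); parked_env and both function names are hypothetical, not anything that exists in that tree:

    /* Hypothetical wrappers illustrating the pattern; parked_env and
     * these function names do not exist in qemu-kvm 0.14. */
    static CPUState *parked_env;

    static void vcpu_unlock_iothread(void)
    {
        /* Remember and clear cpu_single_env so that the assertion in
         * kvm_mutex_unlock() holds during the unlocked phase. */
        if (cpu_single_env) {
            assert(parked_env == NULL);
            parked_env = cpu_single_env;
            cpu_single_env = NULL;
        }
        qemu_mutex_unlock_iothread();
    }

    static void vcpu_lock_iothread(void)
    {
        qemu_mutex_lock_iothread();
        /* Reattach the VCPU once the global mutex is held again. */
        if (parked_env) {
            assert(cpu_single_env == NULL);
            cpu_single_env = parked_env;
            parked_env = NULL;
        }
    }

Note that such a wrapper would only address the assertion; it does nothing about the second issue, since other threads can still re-enter the sections the callback left unprotected while the lock is dropped.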