Re: segmentation fault in qemu-kvm-0.14.0

Jan Kiszka <jan.kiszka@xxxxxx> · Wed, 09 Mar 2011 08:37:48 +0100

On 2011-03-08 23:53, Peter Lieven wrote:
> Hi,
> 
> During testing of qemu-kvm-0.14.0 I can reproduce the following segfault. I have already seen a similar crash in 0.13.0, but had no time to debug it.
> My guess is that this segfault is related to the threaded VNC server, which was introduced in qemu 0.13.0. The bug can only be triggered if a VNC
> client is attached. It might also be connected to a resolution change in the guest. I have attached a backtrace. The debugger is still running if someone
> needs more output.
> 

...

> Thread 1 (Thread 0x7ffff7ff0700 (LWP 29038)):
> #0  0x0000000000000000 in ?? ()
> No symbol table info available.
> #1  0x000000000041d669 in main_loop_wait (nonblocking=0)
>     at /usr/src/qemu-kvm-0.14.0/vl.c:1388

So we are calling an IOHandlerRecord::fd_write handler that is NULL.
Looking at qemu_set_fd_handler2, this may happen if that function is
called for an existing io-handler entry with a non-NULL write handler,
passing a NULL write and a non-NULL read handler. And all this without
the global mutex held.
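
To make the suspected interleaving concrete, here is a minimal
single-threaded replay in C. It is a sketch under assumptions, not the
QEMU source: struct io_record, set_handlers and replay_race are
hypothetical names, and the two "threads" are just consecutive statements
so that the outcome is deterministic. It only illustrates that a check
made before an unlocked in-place update is stale by the time the handler
is dispatched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an io-handler entry; not QEMU's actual struct. */
typedef void (*io_handler_fn)(void *opaque);

struct io_record {
    io_handler_fn fd_read;
    io_handler_fn fd_write;
    void *opaque;
};

static int write_calls;

static void on_read(void *opaque)  { (void)opaque; }
static void on_write(void *opaque) { (void)opaque; write_calls++; }

/* Models updating an existing entry in place, with no locking at all. */
static void set_handlers(struct io_record *r, io_handler_fn rd, io_handler_fn wr)
{
    r->fd_read = rd;
    r->fd_write = wr;
}

/*
 * Replays the suspected interleaving in one thread.
 * Returns 1 if the dispatch step would have called a NULL write handler,
 * 0 otherwise.
 */
static int replay_race(void)
{
    struct io_record rec = { on_read, on_write, NULL };

    /* Main loop, step 1: sees a write handler and decides to dispatch. */
    int will_dispatch = (rec.fd_write != NULL);

    /* Other thread, no lock held: drops the write handler, keeps read. */
    set_handlers(&rec, on_read, NULL);

    /* Main loop, step 2: dispatches based on the now stale decision. */
    if (will_dispatch) {
        if (rec.fd_write == NULL) {
            return 1; /* a real loop would jump to address 0 here */
        }
        rec.fd_write(rec.opaque);
    }
    return 0;
}
```

Calling replay_race() returns 1, i.e. the dispatch step ends up with a
NULL fd_write, which would match frame #0 at address 0 in the backtrace.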

There are in fact calls in vnc_client_write_plain and
vnc_client_write_locked (in contrast to vnc_write) that may generate
this pattern. It's probably worth validating that the iothread lock is
always held when qemu_set_fd_handler2 is invoked to confirm this race
theory, adding something like

assert(pthread_mutex_trylock(&qemu_mutex) != 0);
(that's for qemu-kvm only)
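
For anyone who wants to try the idiom outside of qemu first, here is a
small standalone sketch. global_mutex, mutex_is_held and
update_handlers_checked are hypothetical names, and it assumes a default
(non-recursive) pthread mutex, where trylock fails with EBUSY while the
mutex is locked. Note that the check only proves that some thread holds
the mutex, not that the calling thread does.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the global mutex; the name is illustrative. */
static pthread_mutex_t global_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Returns 1 if some thread currently holds m, 0 if it was free.
 * Assumes a non-recursive mutex. If trylock succeeds we took the lock
 * ourselves, so release it again before reporting "not held".
 */
static int mutex_is_held(pthread_mutex_t *m)
{
    int rc = pthread_mutex_trylock(m);
    if (rc == 0) {
        pthread_mutex_unlock(m);
        return 0;
    }
    return rc == EBUSY;
}

static int handler_updates;

/* Stand-in for a function that must only run with the lock held. */
static void update_handlers_checked(void)
{
    assert(mutex_is_held(&global_mutex));
    handler_updates++;
}
```

With the mutex locked, update_handlers_checked() passes its assertion;
called without the lock, it aborts, which is the signal one would look
for when testing the race theory.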

BTW, qemu with just --enable-vnc-thread, i.e. without io-thread support,
should always run into this race as it then definitely lacks a global mutex.

Jan
