Re: qemu-kvm-1.1.0 crashing with kernel 3.5.0-rc6

On 07/26/12 11:01, Avi Kivity wrote:
On 07/26/2012 12:52 PM, Chris Clayton wrote:
On 07/19/12 19:23, Chris Clayton wrote:
On 07/19/12 13:17, Avi Kivity wrote:
On 07/19/2012 03:14 PM, Chris Clayton wrote:

Change of diagnosis, unfortunately. qemu-kvm-1.0.1 can, in fact, crash
on 3.5.0-rc6 (and rc7). I didn't get it earlier because it takes many
times more invocations before the crash occurs with 1.0.1 and I haven't
used qemu-kvm much in the past few weeks.

I'm now checking whether I can get crashes (with 1.0.1 and/or 1.1.0) on
linux-3.4.4. I'll report back in a day or two.

I've started up qemu-kvm on kernel 3.4.4 many times and not seen a
crash. That would indicate that the problem is in the kernel. However, I
pulled the latest and greatest from Linus yesterday evening and I now
can't get the crash there either, so whatever it was seems to have been
fixed. If I check out and build 3.5.0-rc[1..7], I can get the crash
pretty quickly, so it's been fixed in the last few days.

There were no kvm changes post-rc7.

Yes, I'm aware of that, Avi. This thread started because I was getting a
crash in qemu-kvm, which I thought was only in v1.1.0. Later it turned
out that the problem was also present in v1.0.1, but much harder to hit.
However, it only ever happened with 3.5.0 kernels. 3.4.4, with either
version of qemu-kvm, was stable. So then it seemed that the problem was
in the kernel (but not necessarily in the kvm code).

Something that's changed since rc7 has either fixed the problem or made
it much harder to hit. With rc7 and earlier I can recreate the crash
quite easily with qemu-kvm-1.1.0 and with enough runs of 1.0.1. With
rc7+, I haven't been able to get a crash at all.

Well, I'm getting the crash again, but this time I've managed to get a
backtrace:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb60ffb40 (LWP 9405)]
0xb7803d77 in __strcmp_sse4_2 () from /lib/libc.so.6
(gdb) bt
#0  0xb7803d77 in __strcmp_sse4_2 () from /lib/libc.so.6
#1  0xb7e65333 in g_str_equal () from /usr/lib/libglib-2.0.so.0
#2  0xb7e6458d in g_hash_table_lookup () from /usr/lib/libglib-2.0.so.0
#3  0x8014e2cf in type_table_lookup (name=0x802b0c50 "apic-common") at
qom/object.c:94
#4  type_get_by_name (name=name@entry=0x802b0c50 "apic-common") at
qom/object.c:149
#5  0x8014e933 in object_dynamic_cast (obj=obj@entry=0x80a5d818,
typename=typename@entry=0x802b0c50 "apic-common")
     at qom/object.c:416
#6  0x8014e8b9 in object_dynamic_cast_assert (obj=obj@entry=0x80a5d818,
     typename=typename@entry=0x802b0c50 "apic-common") at qom/object.c:478
#7  0x80193462 in cpu_set_apic_tpr (d=0x80a5d818, val=8 '\b')
     at /home/chris/rpm/BUILD/qemu-kvm-1.1.1/hw/apic_common.c:60
#8  0x801d0560 in kvm_arch_post_run (env=env@entry=0x80a55a60,
run=run@entry=0xb6239000)
     at /home/chris/rpm/BUILD/qemu-kvm-1.1.1/target-i386/kvm.c:1695
#9  0x801cb05f in kvm_cpu_exec (env=env@entry=0x80a55a60) at
/home/chris/rpm/BUILD/qemu-kvm-1.1.1/kvm-all.c:1269
#10 0x80199d1e in qemu_kvm_cpu_thread_fn (arg=0x80a55a60) at
/home/chris/rpm/BUILD/qemu-kvm-1.1.1/cpus.c:752
#11 0xb7a1fd9e in start_thread () from /lib/libpthread.so.0
#12 0xb77bbbbe in clone () from /lib/libc.so.6

This is with kernel 3.5.0 and qemu-kvm-1.1.1. glibc is 2.16.0 built

It looks like general memory corruption.  Is this repeatable?  What's
the guest uptime when it happens (i.e. is it immediate?)

I've just done 10 runs of WinXP SP3 and 5 of them crashed. Three crashed early as XP was starting up - well before the desktop would have appeared. The other two crashed as XP was closing down, having been running for a few minutes (but not doing much).

The error messages seen through dmesg are:

qemu-kvm[12778] general protection ip:b6c43d77 sp:b5e800fc error:0 in libc-2.16.so[b6b06000+1b4000]
qemu-kvm[12813] general protection ip:b6bf6d77 sp:b54ff0fc error:0 in libc-2.16.so[b6ab9000+1b4000]
qemu-kvm[12986] general protection ip:b6cd3d77 sp:b55ff0fc error:0 in libc-2.16.so[b6b96000+1b4000]
qemu-kvm[13045] general protection ip:b6c91d77 sp:b54ff0fc error:0 in libc-2.16.so[b6b54000+1b4000]
qemu-kvm[13225] general protection ip:b6c5bd77 sp:b54ff0fc error:0 in libc-2.16.so[b6b1e000+1b4000]
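
For anyone wanting to compare these faults, a rough sketch of how to reduce each record to an offset inside libc is below. It subtracts the mapping base (the first number in the square brackets) from the faulting ip. The regular expression and the helper name fault_offsets are mine, written against the five lines above only, so treat them as assumptions and not as a general dmesg parser.

```python
import re

# The five dmesg records quoted in this message, one per line.
DMESG = """\
qemu-kvm[12778] general protection ip:b6c43d77 sp:b5e800fc error:0 in libc-2.16.so[b6b06000+1b4000]
qemu-kvm[12813] general protection ip:b6bf6d77 sp:b54ff0fc error:0 in libc-2.16.so[b6ab9000+1b4000]
qemu-kvm[12986] general protection ip:b6cd3d77 sp:b55ff0fc error:0 in libc-2.16.so[b6b96000+1b4000]
qemu-kvm[13045] general protection ip:b6c91d77 sp:b54ff0fc error:0 in libc-2.16.so[b6b54000+1b4000]
qemu-kvm[13225] general protection ip:b6c5bd77 sp:b54ff0fc error:0 in libc-2.16.so[b6b1e000+1b4000]
"""

# Pattern for the "general protection" report format seen above
# (an assumption based on these five lines, not a full dmesg grammar).
RECORD = re.compile(
    r"(?P<comm>[\w.-]+)\[(?P<pid>\d+)\] general protection "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<err>\d+) "
    r"in (?P<lib>[\w.-]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offsets(text):
    """Return (pid, library, offset) for every record found in text.

    offset is the faulting ip minus the library's mapping base, so it is
    comparable across runs even though the load address changes.
    """
    results = []
    for m in RECORD.finditer(text):
        offset = int(m["ip"], 16) - int(m["base"], 16)
        results.append((int(m["pid"]), m["lib"], offset))
    return results


records = fault_offsets(DMESG)
for pid, lib, offset in records:
    print(f"pid {pid}: {lib}+{offset:#x}")
```

All five records come out at libc-2.16.so+0x13dd77, so every crash faults at the same instruction. The low bits (0xd77) also match the 0xb7803d77 address gdb reported in __strcmp_sse4_2, which is at least consistent with all of these being the same strcmp fault as in the backtrace.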

The other 5 were OK, although I only did a bit of web browsing for a few minutes with IE.


Jan, why are we calling cpu_set_apic_tpr() with kvm_irqchip_in_kernel?


