qemu-kvm segfaults in qemu_del_timer (0.10.5 and 0.10.6)

I have a couple of clusters hosting qemu-kvm virtual machines. One of these
clusters consists of dual quad-core Xeon E5420s (vmx), the other consists of
dual quad-core Barcelona Opterons (svm). Both are running x86-64 Linux
2.6.30.4 with the upstream kernel's own kvm modules compiled in.

Running qemu-kvm 0.10.5, I was seeing occasional segfaults from the virtual
machines, perhaps two or three a day across each cluster. The guest OS didn't
appear to be a factor, as both Linux and Windows VMs have crashed. I then
switched to the recently released qemu-kvm 0.10.6, and am still seeing these
segfaults.

It's very hard for me to arrange for core dumps on these live clusters, and the
segfaults are hard to reproduce on test machines because they are rare.
However, I have unstripped copies of the respective binaries and have used gdb
to translate the faulting instruction pointer into a source file and line
number, which I hope might be useful. On both clusters and for each version of
qemu-kvm, segfaults are happening at lines #1161 and #1163 of vl.c (marked
HERE below):

                [...]
                /* stop a timer, but do not dealloc it */
                void qemu_del_timer(QEMUTimer *ts)
                {
                    QEMUTimer **pt, *t;

                    /* NOTE: this code must be signal safe because
                       qemu_timer_expired() can be called from a signal. */
    HERE ==>        pt = &active_timers[ts->clock->type];
                    for(;;) {
    HERE ==>            t = *pt;
                        if (!t)
                            break;
                        if (t == ts) {
                            *pt = t->next;
                            break;
                        }
                        pt = &t->next;
                    }
                }
                [...]

For qemu-kvm 0.10.5, I have large numbers of segfaults in both locations. For
qemu-kvm 0.10.6, my sample is much smaller, but the segfaults I have are all at
line #1161, not #1163.
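
To make the two fault sites concrete, here is a minimal stand-alone sketch of
the same unlink loop. It is not the qemu-kvm code: DemoClock, DemoTimer,
demo_active_timers and the demo_* functions are hypothetical stand-ins that
keep only the fields the loop touches. The comments mark which memory each of
the two flagged statements reads, so line #1161 can only fault if ts or
ts->clock is not a valid pointer, and line #1163 can only fault if pt has been
advanced into a node whose address is not valid.

```c
#include <stddef.h>

/* Hypothetical stand-ins for QEMUClock / QEMUTimer (illustration only). */
typedef struct DemoClock {
    int type;
} DemoClock;

typedef struct DemoTimer {
    DemoClock *clock;
    struct DemoTimer *next;
} DemoTimer;

#define DEMO_NUM_CLOCKS 2

/* One singly linked list of active timers per clock type. */
static DemoTimer *demo_active_timers[DEMO_NUM_CLOCKS];

/* Link a timer at the head of the list for its clock type. */
static void demo_push_timer(DemoTimer *ts)
{
    ts->next = demo_active_timers[ts->clock->type];
    demo_active_timers[ts->clock->type] = ts;
}

/* Same unlink logic as the excerpt; returns 1 if ts was unlinked, else 0. */
static int demo_del_timer(DemoTimer *ts)
{
    DemoTimer **pt, *t;

    /* Corresponds to line #1161: reads ts->clock, then clock->type. */
    pt = &demo_active_timers[ts->clock->type];
    for (;;) {
        /* Corresponds to line #1163: reads the list head on the first
           pass, and the previous node's 'next' field on later passes. */
        t = *pt;
        if (!t)
            return 0;
        if (t == ts) {
            *pt = t->next;
            return 1;
        }
        pt = &t->next;
    }
}

/* Count the timers currently linked for one clock type. */
static int demo_count_timers(int type)
{
    DemoTimer *t;
    int n = 0;

    for (t = demo_active_timers[type]; t != NULL; t = t->next)
        n++;
    return n;
}
```

With valid pointers the loop is well behaved: deleting a linked timer unlinks
it, and deleting it a second time simply walks to the end of the list and
finds nothing.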

One final data point: prior to the 0.10.5 upgrade, we had been running a
(fairly old) kvm-83 userspace without experiencing this segfault problem.

Any help fixing this would be gratefully received!

Cheers,

Chris.

PS One other place I have seen a segfault in 0.10.6 since we rolled it out is
at line #141 of hw/scsi-disk.c, but this has only happened once, which is very
rare compared to the problem I describe above.