qemu-kvm hangs if multipath device is queueing (was: Re: [Qemu-devel] Qemu-KVM 0.12.3 and Multipath -> Assertion)

Hi Kevin,

here we go. I created a blocking multipath device (by interrupting all paths). qemu-kvm hangs at 100% CPU,
and the monitor does not respond either.

If I restore at least one path, the VM continues.

BR,
Peter


^C
Program received signal SIGINT, Interrupt.
0x00007fd8a6aaea94 in __lll_lock_wait () from /lib/libpthread.so.0
(gdb) bt
#0  0x00007fd8a6aaea94 in __lll_lock_wait () from /lib/libpthread.so.0
#1  0x00007fd8a6aaa190 in _L_lock_102 () from /lib/libpthread.so.0
#2  0x00007fd8a6aa9a7e in pthread_mutex_lock () from /lib/libpthread.so.0
#3  0x000000000042e739 in kvm_mutex_lock () at /usr/src/qemu-kvm-0.12.4/qemu-kvm.c:2524
#4  0x000000000042e76e in qemu_mutex_lock_iothread () at /usr/src/qemu-kvm-0.12.4/qemu-kvm.c:2537
#5  0x000000000040c262 in main_loop_wait (timeout=1000) at /usr/src/qemu-kvm-0.12.4/vl.c:3995
#6  0x000000000042dcf1 in kvm_main_loop () at /usr/src/qemu-kvm-0.12.4/qemu-kvm.c:2126
#7  0x000000000040c98c in main_loop () at /usr/src/qemu-kvm-0.12.4/vl.c:4212
#8  0x000000000041054b in main (argc=30, argv=0x7fff266a77e8, envp=0x7fff266a78e0) at /usr/src/qemu-kvm-0.12.4/vl.c:6252
(gdb) bt full
#0  0x00007fd8a6aaea94 in __lll_lock_wait () from /lib/libpthread.so.0
No symbol table info available.
#1  0x00007fd8a6aaa190 in _L_lock_102 () from /lib/libpthread.so.0
No symbol table info available.
#2  0x00007fd8a6aa9a7e in pthread_mutex_lock () from /lib/libpthread.so.0
No symbol table info available.
#3 0x000000000042e739 in kvm_mutex_lock () at /usr/src/qemu-kvm-0.12.4/qemu-kvm.c:2524
No locals.
#4 0x000000000042e76e in qemu_mutex_lock_iothread () at /usr/src/qemu-kvm-0.12.4/qemu-kvm.c:2537
No locals.
#5 0x000000000040c262 in main_loop_wait (timeout=1000) at /usr/src/qemu-kvm-0.12.4/vl.c:3995
   ioh = (IOHandlerRecord *) 0x0
   rfds = {fds_bits = {1048576, 0 <repeats 15 times>}}
   wfds = {fds_bits = {0 <repeats 16 times>}}
   xfds = {fds_bits = {0 <repeats 16 times>}}
   ret = 1
   nfds = 21
   tv = {tv_sec = 0, tv_usec = 999761}
#6 0x000000000042dcf1 in kvm_main_loop () at /usr/src/qemu-kvm-0.12.4/qemu-kvm.c:2126
   fds = {18, 19}
   mask = {__val = {268443712, 0 <repeats 15 times>}}
   sigfd = 20
#7  0x000000000040c98c in main_loop () at /usr/src/qemu-kvm-0.12.4/vl.c:4212
   r = 0
#8 0x000000000041054b in main (argc=30, argv=0x7fff266a77e8, envp=0x7fff266a78e0) at /usr/src/qemu-kvm-0.12.4/vl.c:6252
   gdbstub_dev = 0x0
   boot_devices_bitmap = 12
   i = 0
   snapshot = 0
   linux_boot = 0
   initrd_filename = 0x0
   kernel_filename = 0x0
   kernel_cmdline = 0x588fac ""
   boot_devices = "dc", '\0' <repeats 30 times>
   ds = (DisplayState *) 0x198bf00
   dcl = (DisplayChangeListener *) 0x0
   cyls = 0
   heads = 0
   secs = 0
   translation = 0
   hda_opts = (QemuOpts *) 0x0
   opts = (QemuOpts *) 0x1957390
   optind = 30
   r = 0x7fff266a8a23 "-usbdevice"
   optarg = 0x7fff266a8a2e "tablet"
   loadvm = 0x0
   machine = (QEMUMachine *) 0x861720
   cpu_model = 0x7fff266a8917 "qemu64,model_id=Intel(R) Xeon(R) CPU", ' ' <repeats 11 times>, "E5520 @ 2.27GHz"
   fds = {644511720, 32767}
   tb_size = 0
   pid_file = 0x7fff266a89bb "/var/run/qemu/vm-150.pid"
   incoming = 0x0
   fd = 0
   pwd = (struct passwd *) 0x0
   chroot_dir = 0x0
   run_as = 0x0
   env = (struct CPUX86State *) 0x0
   show_vnc_port = 0
   params = {0x58cc76 "order", 0x58cc7c "once", 0x58cc81 "menu", 0x0}
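
For readers following the backtrace: it shows the main thread stuck in pthread_mutex_lock() under qemu_mutex_lock_iothread(), which would explain why the monitor stops answering. The following is only a rough Python sketch of that pattern, not qemu-kvm code. It assumes (the backtrace does not show this) that some other thread holds the global lock while it waits on the stalled device; the names global_lock, paths_restored and io_thread are made up for illustration. It also does not explain the 100% CPU usage.

```python
import threading
import time

# Stand-in for the qemu-kvm global mutex (hypothetical name).
global_lock = threading.Lock()
# Set once "at least one path" is back (hypothetical stand-in for multipath).
paths_restored = threading.Event()


def io_thread():
    # Assumption: a thread holds the global lock while its I/O is queued
    # by the multipath layer, and only returns when a path comes back.
    with global_lock:
        paths_restored.wait()


def main_loop_iteration(timeout):
    # The real main loop blocks without a timeout; the timeout here only
    # exists so the demo can observe the stall and move on.
    got_lock = global_lock.acquire(timeout=timeout)
    if got_lock:
        global_lock.release()
    return got_lock


def demo():
    worker = threading.Thread(target=io_thread)
    worker.start()
    # Wait until the worker really owns the lock.
    while not global_lock.locked():
        time.sleep(0.01)
    stalled = not main_loop_iteration(0.2)   # main loop cannot make progress
    paths_restored.set()                     # restore one path
    worker.join()
    resumed = main_loop_iteration(0.2)       # main loop runs again
    return stalled, resumed


if __name__ == "__main__":
    print(demo())
```

In this sketch the main loop stalls for as long as the lock holder is stuck and resumes as soon as the event is set, which matches the observation that the VM continues once a path is restored.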

Kevin Wolf wrote:
> On 04.05.2010 15:42, Peter Lieven wrote:
> > hi kevin,
> >
> > you did it *g*
> >
> > looks promising. applied this patch and was not able to reproduce yet :-)
> >
> > a reliable way to reproduce was to shut down all multipath paths, then initiate i/o in the vm (e.g. start an application). of course, everything hangs at this point.
> >
> > previously, after re-enabling one path, the vm crashed. now it seems to behave correctly:
> > it just reports a DMA timeout and continues normally afterwards.
>
> Great, I'm going to submit it as a proper patch then.
>
> Christoph, by now I'm pretty sure it's right, but can you have another
> look if this is correct, anyway?
>
> > can you think of any way to prevent the vm from consuming 100% cpu in
> > that waiting state?
> > my current approach is to run all vms with nice 1, which helped to keep the
> > machine responsive when all vms (in my test case 64 on a box) have hanging
> > i/o at the same time.
>
> I don't have anything particular in mind, but you could just attach gdb
> and get another backtrace while it consumes 100% CPU (you'll need to use
> "thread apply all bt" to catch everything). Then we should see where
> it's hanging.
>
> Kevin
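
One way to collect the all-threads backtrace suggested above without an interactive gdb session is to run gdb in batch mode. This is a small sketch, assuming a gdb that supports -p, -batch and -ex; the helper names are made up, and the pid file path is the one visible in the backtrace (/var/run/qemu/vm-150.pid).

```python
import shlex


def parse_pid(pid_file_text):
    # A pid file such as /var/run/qemu/vm-150.pid is assumed to hold
    # just the process id, possibly followed by a newline.
    return int(pid_file_text.strip())


def gdb_all_threads_cmd(pid):
    # Attach to the running process, dump every thread's backtrace, detach.
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


if __name__ == "__main__":
    # Example with a made-up pid; pass the list to subprocess.run() to execute.
    print(shlex.join(gdb_all_threads_cmd(parse_pid("12345\n"))))
```

Running the command while the VM is spinning should show what the other threads are doing, not only the main thread seen in the trace above.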





--
Kind regards

Peter Lieven

..........................................................................................................

  KAMP Netzwerkdienste GmbH
  Vestische Str. 89-91 | 46117 Oberhausen
  Tel: +49 (0) 208.89 402-50 | Fax: +49 (0) 208.89 402-40
  mailto:pl@xxxxxxx | http://www.kamp.de

  Managing Directors: Heiner Lante | Michael Lante
  Amtsgericht Duisburg | HRB Nr. 12154
  USt-Id-Nr.: DE 120607556

.........................................................................................................