* Ryan Harper <ryanh@xxxxxxxxxx> [2011-05-03 16:57]:
> I've encountered an interesting scenario:
>
> 1. define a guest via virsh define <xml>
> 2. start this guest via virsh
> 3. one of the disk elements is a multipath device that is currently
>    misconfigured such that any io to the device hangs the calling process
> 4. libvirt times out when attempting to communicate via the monitor to
>    the guest (btw, this timeout isn't configurable AFAICT)
> 5. returns an error from create indicating that we failed to create the VM
>
> At this point:
>
> 1) libvirt reports that the VM is stopped (and this is true, the qemu
>    process has never been issued the 'cont' command and thus won't ever
>    execute guest code)
> 2) the qemu process for this VM is still running (just blocked on IO)
>
> 3) it is possible that if the process becomes unblocked that the QEMU
>    process will be functional again, but won't be started, and the process
>    won't be terminated since libvirt isn't tracking this any more, and is
>    consuming some amount of resources that are allocated on start up.
>
>
> How can we clean up from this failure scenario?  Would it make sense for
> libvirt to send a SIGTERM to a qemu if it failed to create?  In the
> above scenario, this would allow us to reap the process if it ever
> became unblocked.

Looks like I completely missed
src/qemu/qemu_process.c:qemuProcessStop() which does indeed send SIGTERM
and SIGKILL.  This should be sufficient to clean up in the above case.

-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
ryanh@xxxxxxxxxx

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list
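
P.S. For anyone finding this in the archive: the SIGTERM-then-SIGKILL
escalation mentioned above can be sketched roughly as follows. This is
only an illustration in Python, not libvirt's code. The function name
terminate_then_kill and the grace_seconds and poll_interval parameters
are hypothetical, and the actual logic in qemuProcessStop() may well
differ in its timing and error handling.

```python
import signal
import subprocess
import time


def terminate_then_kill(proc, grace_seconds=1.0, poll_interval=0.05):
    """Ask a child process to exit with SIGTERM, escalating to SIGKILL.

    `proc` is a subprocess.Popen object. Returns the name of the last
    signal that was sent. (Hypothetical helper, for illustration only.)
    """
    proc.send_signal(signal.SIGTERM)

    # Give the process a grace period to exit on its own.
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        if proc.poll() is not None:
            return "SIGTERM"
        time.sleep(poll_interval)

    # Still alive: escalate. SIGKILL cannot be caught or ignored, but a
    # process stuck in uninterruptible I/O only acts on it once it
    # becomes unblocked, so do not wait forever here.
    proc.kill()
    try:
        proc.wait(timeout=5.0)
    except subprocess.TimeoutExpired:
        pass  # signal is pending; the process dies when the I/O returns
    return "SIGKILL"
```

The bounded wait after SIGKILL matters for the scenario in this thread:
a process blocked on hung I/O will not go away immediately, but the
pending signal lets it be reaped if it ever becomes unblocked.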