On 05/06/2011 05:57 AM, Daniel P. Berrange wrote: > Introduce a virProcessKill function that can be safely called > even when the job mutex is held. This allows virDomainDestroy > to kill any VM even if it is asleep in a monitor job. The PID > will die and the thread asleep on the monitor will then wake > up releasing the job mutex. Very nice. This will help issues where if you kill a migration at the source, then the destination is left with a stale VM with the job lock held waiting for the incoming migration to complete. With this, you now have the means to forcibly kill the leftover VM on the destination, rather than waiting for a network timeout. > +++ b/src/qemu/qemu_driver.c > @@ -1482,6 +1482,13 @@ static int qemudDomainDestroy(virDomainPtr dom) { > goto cleanup; > } > > + /* Although qemuProcessStop does this already, there may > + * be an outstanding job active. We want to make sure we > + * can kill the process even if a job is active. Killing > + * it now, means the job will be released s/now,/now/ > + */ > + qemuProcessKill(vm); > + > if (qemuDomainObjBeginJobWithDriver(driver, vm) < 0) > goto cleanup; > > diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c > index a6c0dc8..c60c51f 100644 > --- a/src/qemu/qemu_process.c > +++ b/src/qemu/qemu_process.c > @@ -2369,6 +2369,36 @@ cleanup: > } > > > +void qemuProcessKill(virDomainObjPtr vm) Do you want this to return -1 on failure to kill, such as if waitpid() fails? On the other hand, I guess that best effort is okay; you already did VIR_DEBUG logging of failures, and the caller can't do much even if they were to check for a failure. > +{ > + int i; > + int rc; > + VIR_DEBUG("vm=%s pid=%d", vm->def->name, vm->pid); > + > + if (!virDomainObjIsActive(vm)) { > + VIR_DEBUG("VM '%s' not active", vm->def->name); > + return; > + } > + > + for (i = 0 ; i < 15 ; i++) { > + int signum; > + if (i == 0) > + signum = SIGTERM; > + else if (i == 8) > + signum = SIGKILL; > + else > + signum = 0; /* Just check for existance */ s/existance/existence/ It took me three reads to figure out what this loop is doing; a comment would be helpful: /* Try SIGTERM, then wait 1.6 seconds, then try SIGKILL, all while probing every 200ms to see if the process is gone */ > + > + rc = virKillProcess(vm->pid, signum); > + VIR_DEBUG("Iteration %d rc=%d", i, rc); > + if (rc < 0) > + break; When errno==ESRCH, this breaks the loop because the child is dead. But for any other errno, should you log (at least at VIR_DEBUG) the errno that causes failure, since that means we didn't exit the loop normally? > + > + usleep(200 * 1000); > + } > +} > + ACK with those nits addressed. -- Eric Blake eblake@xxxxxxxxxx +1-801-349-2682 Libvirt virtualization library http://libvirt.org
Attachment:
signature.asc
Description: OpenPGP digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list