On Fri, Jan 18, 2013 at 02:39:11PM +0000, Daniel P. Berrange wrote: > From: "Daniel P. Berrange" <berrange@xxxxxxxxxx> > > When running virDomainDestroy, we need to make sure that no other > background thread cleans up the domain while we're doing our work. > This can happen if we release the domain object while in the > middle of work, because the monitor might detect EOF in this window. > For this reason we have a 'beingDestroyed' flag to stop the monitor > from doing its normal cleanup. Unfortunately this flag was only > being used to protect qemuDomainBeginJob, and not qemuProcessKill > > This left open a race condition where either libvirtd could crash, > or alternatively report bogus error messages about the domain already > having been destroyed to the caller > > Signed-off-by: Daniel P. Berrange <berrange@xxxxxxxxxx> > --- > src/qemu/qemu_driver.c | 19 ++++++++++++------- > 1 file changed, 12 insertions(+), 7 deletions(-) > > diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c > index c28c223..2a6f381 100644 > --- a/src/qemu/qemu_driver.c > +++ b/src/qemu/qemu_driver.c > @@ -2121,24 +2121,29 @@ qemuDomainDestroyFlags(virDomainPtr dom, > > qemuDomainSetFakeReboot(driver, vm, false); > > + > + /* We need to prevent monitor EOF callback from doing our work (and sending > + * misleading events) while the vm is unlocked inside BeginJob/ProcessKill API > + */ > + priv->beingDestroyed = true; > + > /* Although qemuProcessStop does this already, there may > * be an outstanding job active. We want to make sure we > * can kill the process even if a job is active. Killing > * it now means the job will be released > */ > if (flags & VIR_DOMAIN_DESTROY_GRACEFUL) { > - if (qemuProcessKill(driver, vm, 0) < 0) > + if (qemuProcessKill(driver, vm, 0) < 0) { > + priv->beingDestroyed = false; > goto cleanup; > + } > } else { > - if (qemuProcessKill(driver, vm, VIR_QEMU_PROCESS_KILL_FORCE) < 0) > + if (qemuProcessKill(driver, vm, VIR_QEMU_PROCESS_KILL_FORCE) < 0) { > + priv->beingDestroyed = false; > goto cleanup; > + } > } > > - /* We need to prevent monitor EOF callback from doing our work (and sending > - * misleading events) while the vm is unlocked inside BeginJob API > - */ > - priv->beingDestroyed = true; > - > if (qemuDomainObjBeginJobWithDriver(driver, vm, QEMU_JOB_DESTROY) < 0) > goto cleanup; > > -- > 1.8.0.2 This is a really difficult bug to reproduce using libguestfs. I've only reproduced it a handful of times in about 3 months. I would say, push it, based on: (1) Rationale and patch looks sane. (2) It fixes your minimal test case (which is a strong enough reason in itself). Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/ -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list