On 12/7/18 8:47 AM, Marek Marczykowski-Górecki wrote:
Hi,

If one kills a domain with `xl destroy`, libvirt will not notice and will still report the domain as running. Trying to destroy the domain through libvirt will also fail. The only way to recover from such a situation is to restart the libvirt daemon.
Ouch.
While (I think) the right answer is "don't use xl destroy", libvirt could do better.
Nod. Restarting libvirtd is pretty drastic.
libxl actually reports such cases via LIBXL_EVENT_TYPE_DOMAIN_DEATH, but the event is ignored by libvirt's libxl driver.
Why don't we handle that in libxlDomainEventHandler()?
The problem is that LIBXL_EVENT_TYPE_DOMAIN_DEATH is also reported when it's triggered by libvirt (for example libxlDomainShutdownHandleDestroy, or libxlDomainDestroyFlags).
Ah, yes, now I remember why :-).
For a proper solution I need a race-free way of checking whether a given LIBXL_EVENT_TYPE_DOMAIN_DEATH was caused by libvirt (and is already handled, including setting the appropriate domain object state etc.).

A bit of context:
https://github.com/libvirt/libvirt/blob/master/src/libxl/libxl_domain.c#L750-L765

    libxlDomainDestroyInternal(libxlDriverPrivatePtr driver,
                               virDomainObjPtr vm)
    {
        libxlDriverConfigPtr cfg = libxlDriverConfigGet(driver);
        int ret = -1;

        /* Unlock virDomainObj during destroy, which can take considerable
         * time on large memory domains.
         */
        virObjectUnlock(vm);
        ret = libxl_domain_destroy(cfg->ctx, vm->def->id, NULL);
        virObjectLock(vm);

        virObjectUnref(cfg);
        return ret;
    }

Example usage:
https://github.com/libvirt/libvirt/blob/master/src/libxl/libxl_driver.c#L1357-L1409

    static int
    libxlDomainDestroyFlags(virDomainPtr dom,
                            unsigned int flags)
    {
        (...)
        if (libxlDomainObjBeginJob(driver, vm, LIBXL_JOB_MODIFY) < 0)
            goto cleanup;

        if (virDomainObjCheckActive(vm) < 0)
            goto endjob;

        if (libxlDomainDestroyInternal(driver, vm) < 0) {
            virReportError(VIR_ERR_INTERNAL_ERROR,
                           _("Failed to destroy domain '%d'"), vm->def->id);
            goto endjob;
        }

        virDomainObjSetState(vm, VIR_DOMAIN_SHUTOFF,
                             VIR_DOMAIN_SHUTOFF_DESTROYED);

        event = virDomainEventLifecycleNewFromObj(vm, VIR_DOMAIN_EVENT_STOPPED,
                                                  VIR_DOMAIN_EVENT_STOPPED_DESTROYED);
        (...)
    }

It's tricky, because the domain object lock is released for the duration of the libxl_domain_destroy() call, and I think there is no guarantee that libxlDomainDestroyInternal() will take the lock back before LIBXL_EVENT_TYPE_DOMAIN_DEATH is reported (and possibly handled). This means I can't use the domain object state to check whether libvirt already knows about this domain death, because the state is set to VIR_DOMAIN_SHUTOFF only after the libxlDomainDestroyInternal() call returns.
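To make the interleaving concrete, here is a small single-threaded model of it in C. It is only a sketch: every `mock_*` name is a hypothetical stand-in, not libvirt or libxl code, and the "lock" is a plain integer so the ordering can be replayed deterministically. It assumes a naive death handler that relies on the domain state alone.

```c
#include <assert.h>

/* Hypothetical stand-ins for the libvirt domain object; not real types. */
enum mock_state { MOCK_RUNNING, MOCK_SHUTOFF };

struct mock_domain {
    enum mock_state state;
    int locked;            /* 1 while the modelled object lock is held */
    int lifecycle_events;  /* number of "stopped" events emitted so far */
};

static void mock_lock(struct mock_domain *d)
{
    assert(!d->locked);    /* in this model the lock is never contended */
    d->locked = 1;
}

static void mock_unlock(struct mock_domain *d)
{
    assert(d->locked);
    d->locked = 0;
}

/* Naive DOMAIN_DEATH handler: treats "still running" as an external kill. */
static void mock_death_handler(struct mock_domain *d)
{
    mock_lock(d);
    if (d->state == MOCK_RUNNING) {
        d->state = MOCK_SHUTOFF;
        d->lifecycle_events++;
    }
    mock_unlock(d);
}

/* Replays a libvirt-initiated destroy and returns how many lifecycle
 * events were emitted. event_in_window selects whether the death event
 * is handled while the lock is dropped (1) or after the caller has
 * already updated the state (0). */
static int mock_destroy_event_count(int event_in_window)
{
    struct mock_domain d = { MOCK_RUNNING, 0, 0 };

    mock_lock(&d);                    /* job started, object locked */
    mock_unlock(&d);                  /* models virObjectUnlock(vm) */
    /* libxl_domain_destroy() would run here */
    if (event_in_window)
        mock_death_handler(&d);       /* handler wins the race */
    mock_lock(&d);                    /* models virObjectLock(vm) */

    d.state = MOCK_SHUTOFF;           /* caller sets the state ... */
    d.lifecycle_events++;             /* ... and emits its own event */
    mock_unlock(&d);

    if (!event_in_window)
        mock_death_handler(&d);       /* late event: state is already SHUTOFF */

    return d.lifecycle_events;
}
```

In this model `mock_destroy_event_count(0)` yields a single event, while `mock_destroy_event_count(1)` yields two, which is the duplicate lifecycle event a state-only check cannot rule out.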
Setting the domain object state twice (once from the LIBXL_EVENT_TYPE_DOMAIN_DEATH handler, then from the libxlDomainDestroyInternal() caller) wouldn't be that bad (only ugly), but emitting the domain lifecycle event twice would be a major problem.
Yes, that is a problem.
Any ideas? Some internal flag set by libxlDomainDestroyInternal() just before the libxl_domain_destroy() call? Where would be the right place for such a thing?
In the libxlDomainObjPrivate object, defined in libxl_domain.h?

Regards,
Jim
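The flag idea discussed above could look roughly like the following single-threaded C model. This is a sketch under assumptions, not a patch: `destroy_in_progress` is a hypothetical field name, `struct sketch_priv` merely stands in for libxlDomainObjPrivate, and none of the `sketch_*` functions exist in libvirt. The flag is set while the object lock is still held, so a death handler that later takes the lock can tell a libvirt-initiated destroy from an external one.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_state { SKETCH_RUNNING, SKETCH_SHUTOFF };

/* Stand-in for libxlDomainObjPrivate; the field is hypothetical. */
struct sketch_priv {
    bool destroy_in_progress;
};

struct sketch_domain {
    enum sketch_state state;
    bool locked;
    int lifecycle_events;
    struct sketch_priv priv;
};

static void sketch_lock(struct sketch_domain *d)
{
    assert(!d->locked);
    d->locked = true;
}

static void sketch_unlock(struct sketch_domain *d)
{
    assert(d->locked);
    d->locked = false;
}

static void sketch_mark_stopped(struct sketch_domain *d)
{
    d->state = SKETCH_SHUTOFF;
    d->lifecycle_events++;
}

/* DOMAIN_DEATH handler: acts only on deaths libvirt did not cause. */
static void sketch_death_handler(struct sketch_domain *d)
{
    sketch_lock(d);
    if (!d->priv.destroy_in_progress && d->state == SKETCH_RUNNING)
        sketch_mark_stopped(d);       /* external kill, e.g. xl destroy */
    sketch_unlock(d);
}

/* libvirt-initiated destroy; returns the number of lifecycle events. */
static int sketch_libvirt_destroy_events(bool event_in_window)
{
    struct sketch_domain d = { SKETCH_RUNNING, false, 0, { false } };

    sketch_lock(&d);
    d.priv.destroy_in_progress = true;   /* set before dropping the lock */
    sketch_unlock(&d);
    /* libxl_domain_destroy() would run here */
    if (event_in_window)
        sketch_death_handler(&d);        /* sees the flag, does nothing */
    sketch_lock(&d);

    sketch_mark_stopped(&d);             /* the only event for this destroy */
    d.priv.destroy_in_progress = false;
    sketch_unlock(&d);

    if (!event_in_window)
        sketch_death_handler(&d);        /* late event: already SHUTOFF */

    return d.lifecycle_events;
}

/* Death with no libvirt involvement; returns the number of events. */
static int sketch_external_kill_events(void)
{
    struct sketch_domain d = { SKETCH_RUNNING, false, 0, { false } };

    sketch_death_handler(&d);
    return d.lifecycle_events;
}
```

In this model each path emits exactly one lifecycle event, regardless of where the death event lands. A real change would also have to clear the flag when libxl_domain_destroy() fails, which the sketch does not cover.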