On Thu, Dec 07, 2023 at 08:52:39PM +0800, tugy@xxxxxxxxxxxxxxx wrote: > From: Guoyi Tu <tugy@xxxxxxxxxxxxxxx> > > Currently, libvirt creates a thread pool with only on thread to handle all > qemu monitor events for virtual machines, In the cases that if the thread > gets stuck while handling a monitor EOF event, such as unable to kill the > virtual machine process or release resources, the events of other virtual > machine will be also blocked, which will lead to the abnormal behavior of > other virtual machines. > > For instance, when another virtual machine completes a shutdown operation > and the monitor EOF event has been queued but remains unprocessed, we > immediately destroy and start the virtual machine again, at a later time > when EOF event get processed, the processMonitorEOFEvent() will kill the > virtual machine that just started. > > To address this issue, in the processMonitorEOFEvent(), we check whether > the current virtual machine's id is equal to the the one at the time > the event was generated. If they do not match, we immediately return. > > Signed-off-by: Guoyi Tu <tugy@xxxxxxxxxxxxxxx> > Signed-off-by: dengpengcheng <dengpc12@xxxxxxxxxxxxxxx> > --- > src/qemu/qemu_domain.c | 2 +- > src/qemu/qemu_driver.c | 11 +++++++++-- > src/qemu/qemu_process.c | 2 +- > 3 files changed, 11 insertions(+), 4 deletions(-) Reviewed-by: Daniel P. Berrangé <berrange@xxxxxxxxxx> > diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c > index 64afae6450..cfc5b79657 100644 > --- a/src/qemu/qemu_driver.c > +++ b/src/qemu/qemu_driver.c > @@ -3854,7 +3854,8 @@ processJobStatusChangeEvent(virDomainObj *vm, > > static void > processMonitorEOFEvent(virQEMUDriver *driver, > - virDomainObj *vm) > + virDomainObj *vm, > + int domid) > { > qemuDomainObjPrivate *priv = vm->privateData; > int eventReason = VIR_DOMAIN_EVENT_STOPPED_SHUTDOWN; > @@ -3863,6 +3864,12 @@ processMonitorEOFEvent(virQEMUDriver *driver, > unsigned int stopFlags = 0; > virObjectEvent *event = NULL; > > + if (vm->def->id != domid) { > + VIR_ERROR("Domain %s was restarted, ignoring EOF", > + vm->def->name); > + return; > + } Since this race scenario is something we expect to happen periodically, we don't want VIR_ERROR to be used, as it can make admins (incorrectly) believe something is broken. I'm going to replace this with VIR_DEBUG when I merge it. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| _______________________________________________ Devel mailing list -- devel@xxxxxxxxxxxxxxxxx To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxx