On Thu, Sep 01, 2022 at 17:18:55 +0200, Jiri Denemark wrote: > On Thu, Sep 01, 2022 at 16:51:34 +0200, Peter Krempa wrote: > > On Thu, Sep 01, 2022 at 14:47:41 +0200, Jiri Denemark wrote: > > > We have always considered "migrate_cancel" QMP command to return after > > > successfully cancelling the migration. But this is no longer true (to be > > > honest I'm not sure it ever was) as it just changes the migration state > > > to "cancelling". In most cases the migration is canceled pretty quickly > > > and we don't really notice anything, but sometimes it takes so long we > > > even get to clearing migration capabilities before the migration is > > > actually canceled, which fails as capabilities can only be changed when > > > no migration is running. So to avoid this issue, we can wait for the > > > migration to be really canceled after sending migrate_cancel. The only > > > place where we don't need synchronous behavior is when we're cancelling > > > migration on user's request while it is actively watched by another > > > thread. > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=2114866 > > > > > > Signed-off-by: Jiri Denemark <jdenemar@xxxxxxxxxx> > > > --- > > > src/qemu/qemu_driver.c | 2 +- > > > src/qemu/qemu_migration.c | 45 +++++++++++++++++++++++++++++++++++---- > > > src/qemu/qemu_migration.h | 3 ++- > > > src/qemu/qemu_process.c | 2 +- > > > 4 files changed, 45 insertions(+), 7 deletions(-) > > > > [...] > > > > > int > > > qemuMigrationSrcCancel(virDomainObj *vm, > > > - virDomainAsyncJob asyncJob) > > > + virDomainAsyncJob asyncJob, > > > + bool wait) > > > { > > > qemuDomainObjPrivate *priv = vm->privateData; > > > > > > @@ -4625,6 +4653,15 @@ qemuMigrationSrcCancel(virDomainObj *vm, > > > qemuMonitorMigrateCancel(priv->mon); > > > qemuDomainObjExitMonitor(vm); > > > > > > + if (virDomainObjIsActive(vm) && wait) { > > > > Is the call to virDomainObjIsActive() necessary here? IIUC the domain > > shutdown code is always executed in a way to make sure that waiting > > threads are always woken. > > > > > > > + VIR_DEBUG("Waiting for migration to be canceled"); > > > + > > > + while (!qemuMigrationSrcIsCanceled(vm)) { > > > + if (qemuDomainObjWait(vm) < 0) > > > > So here if the VM would crash before we wait we'd report success and if > > it crashed during our wait we'll report failure, which seems weird too. > > Oh right, qemuDomainObjWait already checks for virDomainObjIsActive so > we don't have to do it explicitly here. Just > > if (wait) { > ... > } > > is enough. Reviewed-by: Peter Krempa <pkrempa@xxxxxxxxxx>