On Thu, May 12, 2022 at 10:39:50 +0200, Peter Krempa wrote: > On Tue, May 10, 2022 at 17:21:08 +0200, Jiri Denemark wrote: > > The check can reveal a serious bug in our migration code and we should > > not silently ignore it. > > > > Signed-off-by: Jiri Denemark <jdenemar@xxxxxxxxxx> > > --- > > src/qemu/qemu_migration.c | 58 ++++++++++++++++++++++++--------------- > > 1 file changed, 36 insertions(+), 22 deletions(-) > > > > diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c > > index 5b6073b963..1c5dd9b391 100644 > > --- a/src/qemu/qemu_migration.c > > +++ b/src/qemu/qemu_migration.c > > @@ -147,9 +147,10 @@ qemuMigrationCheckPhase(virDomainObj *vm, > > > > if (phase < QEMU_MIGRATION_PHASE_POSTCOPY_FAILED && > > phase < priv->job.phase) { > > - VIR_ERROR(_("migration protocol going backwards %s => %s"), > > - qemuMigrationJobPhaseTypeToString(priv->job.phase), > > - qemuMigrationJobPhaseTypeToString(phase)); > > + virReportError(VIR_ERR_INTERNAL_ERROR, > > + _("migration protocol going backwards %s => %s"), > > + qemuMigrationJobPhaseTypeToString(priv->job.phase), > > + qemuMigrationJobPhaseTypeToString(phase)); > > This bit seems to belongs to the previous commit actually, but since > it's not used anywhere else ... It is intentionally here as the previous commit is just a code movement with no functional change. > > > return -1; > > } > > > > [...] > > > @@ -4920,7 +4928,7 @@ qemuMigrationSrcPerformPeer2Peer2(virQEMUDriver *driver, > > * until the migration is complete. > > */ > > VIR_DEBUG("Perform %p", sconn); > > - qemuMigrationJobSetPhase(vm, QEMU_MIGRATION_PHASE_PERFORM2); > > + ignore_value(qemuMigrationJobSetPhase(vm, QEMU_MIGRATION_PHASE_PERFORM2)); > > if (flags & VIR_MIGRATE_TUNNELLED) > > ret = qemuMigrationSrcPerformTunnel(driver, vm, st, NULL, > > NULL, 0, NULL, NULL, > > @@ -5164,7 +5172,7 @@ qemuMigrationSrcPerformPeer2Peer3(virQEMUDriver *driver, > > * confirm migration completion. > > */ > > VIR_DEBUG("Perform3 %p uri=%s", sconn, NULLSTR(uri)); > > - qemuMigrationJobSetPhase(vm, QEMU_MIGRATION_PHASE_PERFORM3); > > + ignore_value(qemuMigrationJobSetPhase(vm, QEMU_MIGRATION_PHASE_PERFORM3)); > > VIR_FREE(cookiein); > > cookiein = g_steal_pointer(&cookieout); > > cookieinlen = cookieoutlen; > > Any reason why you want to ignore this before the migration was > performed? > > > @@ -5189,7 +5197,7 @@ qemuMigrationSrcPerformPeer2Peer3(virQEMUDriver *driver, > > if (ret < 0) { > > virErrorPreserveLast(&orig_err); > > } else { > > - qemuMigrationJobSetPhase(vm, QEMU_MIGRATION_PHASE_PERFORM3_DONE); > > + ignore_value(qemuMigrationJobSetPhase(vm, QEMU_MIGRATION_PHASE_PERFORM3_DONE)); > > } > > > > /* If Perform returns < 0, then we need to cancel the VM > > I could somehwat understand it here after the migration is done, but a > bug could be also in this code. Mostly because we're in a p2p migration where everything is done within a single API call and thus it cannot really fail. > > > @@ -5657,7 +5667,9 @@ qemuMigrationSrcPerformPhase(virQEMUDriver *driver, > > return ret; > > } > > > > - qemuMigrationJobStartPhase(vm, QEMU_MIGRATION_PHASE_PERFORM3); > > + if (qemuMigrationJobStartPhase(vm, QEMU_MIGRATION_PHASE_PERFORM3) < 0) > > + goto endjob; > > + > > virCloseCallbacksUnset(driver->closeCallbacks, vm, > > qemuMigrationSrcCleanup); > > > > @@ -5671,7 +5683,7 @@ qemuMigrationSrcPerformPhase(virQEMUDriver *driver, > > goto endjob; > > } > > > > - qemuMigrationJobSetPhase(vm, QEMU_MIGRATION_PHASE_PERFORM3_DONE); > > + ignore_value(qemuMigrationJobSetPhase(vm, QEMU_MIGRATION_PHASE_PERFORM3_DONE)); > > > > if (virCloseCallbacksSet(driver->closeCallbacks, vm, conn, > > qemuMigrationSrcCleanup) < 0) > > Same here. It is similar to the p2p case... we already started a phase in the same API just a few lines above and thus this call cannot really fail. I guess I could add actual handling in all the cases here for consistency even though it would effectively be a dead code. I chose ignore_value() as it is less work :-) Jirka