On 07 Oct 2014, at 23:02 , Jiri Denemark <jdenemar@xxxxxxxxxx> wrote: > On Tue, Sep 30, 2014 at 16:39:27 +0200, Cristian Klein wrote: >> Perform phase stops once migration switched to post-copy. >> Confirm phase waits for post-copy to finish before killing the VM. >> >> Signed-off-by: Cristian Klein <cristian.klein@xxxxxxxxx> >> --- >> src/qemu/qemu_driver.c | 8 ++++++++ >> src/qemu/qemu_migration.c | 25 ++++++++++++++++++++++--- >> 2 files changed, 30 insertions(+), 3 deletions(-) >> >> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c >> index e873d45..3fe2216 100644 >> --- a/src/qemu/qemu_driver.c >> +++ b/src/qemu/qemu_driver.c >> @@ -10942,6 +10942,14 @@ qemuDomainMigratePrepare2(virConnectPtr dconn, >> >> virCheckFlags(QEMU_MIGRATION_FLAGS, -1); >> >> + if (flags & VIR_MIGRATE_POSTCOPY) { >> + /* post-copy migration does not work with Sequence v2 */ >> + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", >> + _("Post-copy migration requested but not " >> + "supported by v2 protocol")); >> + goto cleanup; >> + } >> + > > This code should be unreachable. If both source and destination daemons > support VIR_MIGRATE_POSTCOPY, they support v3 protocol as well. And a > client new enough to specify VIR_MIGRATE_POSTCOPY will also support v3 > migration protocol. However, it doesn't hurt to check this to be safe. > >> if (flags & VIR_MIGRATE_TUNNELLED) { >> /* this is a logical error; we never should have gotten here with >> * VIR_MIGRATE_TUNNELLED set >> diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c >> index 4a36946..436b701 100644 >> --- a/src/qemu/qemu_migration.c >> +++ b/src/qemu/qemu_migration.c >> @@ -2039,6 +2039,11 @@ qemuMigrationUpdateJobStatus(virQEMUDriverPtr driver, >> ret = 0; >> break; >> >> + case QEMU_MONITOR_MIGRATION_STATUS_POSTCOPY_ACTIVE: >> + jobInfo->type = VIR_DOMAIN_JOB_PHASE1_COMPLETED; >> + ret = 0; >> + break; >> + > > This will need to be dropped after 5/8 is removed. However, it reminds > me... > > enum { > QEMU_MONITOR_MIGRATION_STATUS_INACTIVE, > QEMU_MONITOR_MIGRATION_STATUS_ACTIVE, > QEMU_MONITOR_MIGRATION_STATUS_COMPLETED, > QEMU_MONITOR_MIGRATION_STATUS_ERROR, > QEMU_MONITOR_MIGRATION_STATUS_CANCELLED, > QEMU_MONITOR_MIGRATION_STATUS_SETUP, > QEMU_MONITOR_MIGRATION_STATUS_POSTCOPY_ACTIVE, > > QEMU_MONITOR_MIGRATION_STATUS_LAST > }; > > in qemu_monitor.h needs to be turned into a proper typedef so that > > switch (status.status) { > > line in qemuMigrationUpdateJobStatus may be changed to explicitly > mention the enum so that the compiler may report a warning whenever we > add new status but forgot to handle it in this switch. What about `QEMU_MONITOR_MIGRATION_STATUS_LAST`? Should this be included in a switch with an assertion that that code should never be reached? > Which means the > new state will need to be handled in the same patch it was introduced, > i.e, in 3/8. > >> case QEMU_MONITOR_MIGRATION_STATUS_INACTIVE: >> jobInfo->type = VIR_DOMAIN_JOB_NONE; >> virReportError(VIR_ERR_OPERATION_FAILED, >> @@ -2077,6 +2082,7 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver, >> qemuDomainJobInfoPtr jobInfo = priv->job.current; >> const char *job; >> int pauseReason; >> + bool inPhase2 = (jobInfo->type == VIR_DOMAIN_JOB_PHASE1_COMPLETED); > > I think it would be cleaner to pass this info in a new parameter for > qemuMigrationWaitForCompletion instead of doing a hidden black magic > here. I agree with you, I wasn’t sure about with this way of coding things either. Is it okey to ask developers to always create a new intermediate parameter to call this function, so as to make its usage easier to read? E.g., “”” bool return_on_postcopy_active = false; rv = qemuMigrationWaitForCompletion(driver, vm, QEMU_ASYNC_JOB_MIGRATION_OUT, conn, abort_on_error, return_on_postcopy_active); “”" > >> >> switch (priv->job.asyncJob) { >> case QEMU_ASYNC_JOB_MIGRATION_OUT: >> @@ -2092,9 +2098,11 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver, >> job = _("job"); >> } >> >> - jobInfo->type = VIR_DOMAIN_JOB_UNBOUNDED; >> + if (!inPhase2) >> + jobInfo->type = VIR_DOMAIN_JOB_UNBOUNDED; >> >> - while (jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED) { >> + while (jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED || >> + (inPhase2 && jobInfo->type == VIR_DOMAIN_JOB_PHASE1_COMPLETED)) { > > Just use jobInfo->status.status directly. > >> /* Poll every 50ms for progress & to allow cancellation */ >> struct timespec ts = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000ull }; >> >> @@ -2123,7 +2131,8 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver, >> virObjectLock(vm); >> } >> >> - if (jobInfo->type == VIR_DOMAIN_JOB_COMPLETED) { >> + if (jobInfo->type == VIR_DOMAIN_JOB_COMPLETED || >> + jobInfo->type == VIR_DOMAIN_JOB_PHASE1_COMPLETED) { > > This shouldn't be needed when 5/8 is dropped. > >> qemuDomainJobInfoUpdateDowntime(jobInfo); >> VIR_FREE(priv->job.completed); >> if (VIR_ALLOC(priv->job.completed) == 0) >> @@ -3149,6 +3158,16 @@ qemuMigrationConfirmPhase(virQEMUDriverPtr driver, >> >> virCheckFlags(QEMU_MIGRATION_FLAGS, -1); >> >> + /* Wait for post-copy to complete */ >> + if (flags & VIR_MIGRATE_POSTCOPY) { >> + bool abort_on_error = !!(flags & VIR_MIGRATE_ABORT_ON_ERROR); >> + rv = qemuMigrationWaitForCompletion(driver, vm, >> + QEMU_ASYNC_JOB_MIGRATION_OUT, >> + conn, abort_on_error); >> + if (rv < 0) >> + goto cleanup; >> + } >> + > > This has to be done after setting the phase. We may need to add a new so > that we can recover from post-copy migrations when libvirtd is > restarted. Correct me if I’m wrong, but isn’t the phase already set to `QEMU_MIGRATION_PHASE_CONFIRM3` by `qemuMigrationConfirm`, which calls `qemuMigrationConfirmPhase`? Also, I’m not sure we need another phase. If libvirtd is restarted, it should continue with phase CONFIRM3, wait for post-copy migration to finish or observe that the VM has finished post-copy migration, and kill it. Am I missing something? > >> qemuMigrationJobSetPhase(driver, vm, >> retcode == 0 >> ? QEMU_MIGRATION_PHASE_CONFIRM3 Cristian -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list