Re: [PATCH v2 6/8] Implemented post-copy migration logic in qemu

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Sep 30, 2014 at 16:39:27 +0200, Cristian Klein wrote:
> Perform phase stops once migration switched to post-copy.
> Confirm phase waits for post-copy to finish before killing the VM.
> 
> Signed-off-by: Cristian Klein <cristian.klein@xxxxxxxxx>
> ---
>  src/qemu/qemu_driver.c    |  8 ++++++++
>  src/qemu/qemu_migration.c | 25 ++++++++++++++++++++++---
>  2 files changed, 30 insertions(+), 3 deletions(-)
> 
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index e873d45..3fe2216 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -10942,6 +10942,14 @@ qemuDomainMigratePrepare2(virConnectPtr dconn,
>  
>      virCheckFlags(QEMU_MIGRATION_FLAGS, -1);
>  
> +    if (flags & VIR_MIGRATE_POSTCOPY) {
> +        /* post-copy migration does not work with Sequence v2 */
> +        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> +                       _("Post-copy migration requested but not "
> +                         "supported by v2 protocol"));
> +        goto cleanup;
> +    }
> +

This code should be unreachable. If both source and destination daemons
support VIR_MIGRATE_POSTCOPY, they support v3 protocol as well. And a
client new enough to specify VIR_MIGRATE_POSTCOPY will also support v3
migration protocol. However, it doesn't hurt to check this to be safe.

>      if (flags & VIR_MIGRATE_TUNNELLED) {
>          /* this is a logical error; we never should have gotten here with
>           * VIR_MIGRATE_TUNNELLED set
> diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
> index 4a36946..436b701 100644
> --- a/src/qemu/qemu_migration.c
> +++ b/src/qemu/qemu_migration.c
> @@ -2039,6 +2039,11 @@ qemuMigrationUpdateJobStatus(virQEMUDriverPtr driver,
>          ret = 0;
>          break;
>  
> +    case QEMU_MONITOR_MIGRATION_STATUS_POSTCOPY_ACTIVE:
> +        jobInfo->type = VIR_DOMAIN_JOB_PHASE1_COMPLETED;
> +        ret = 0;
> +        break;
> +

This will need to be dropped after 5/8 is removed. However, it reminds
me...

    enum {
        QEMU_MONITOR_MIGRATION_STATUS_INACTIVE,
        QEMU_MONITOR_MIGRATION_STATUS_ACTIVE,
        QEMU_MONITOR_MIGRATION_STATUS_COMPLETED,
        QEMU_MONITOR_MIGRATION_STATUS_ERROR,
        QEMU_MONITOR_MIGRATION_STATUS_CANCELLED,
        QEMU_MONITOR_MIGRATION_STATUS_SETUP,
        QEMU_MONITOR_MIGRATION_STATUS_POSTCOPY_ACTIVE,

        QEMU_MONITOR_MIGRATION_STATUS_LAST
    };

in qemu_monitor.h needs to be turned into a proper typedef so that

    switch (status.status) {

line in qemuMigrationUpdateJobStatus may be changed to explicitly
mention the enum so that the compiler may report a warning whenever we
add new status but forgot to handle it in this switch. Which means the
new state will need to be handled in the same patch it was introduced,
i.e, in 3/8.

>      case QEMU_MONITOR_MIGRATION_STATUS_INACTIVE:
>          jobInfo->type = VIR_DOMAIN_JOB_NONE;
>          virReportError(VIR_ERR_OPERATION_FAILED,
> @@ -2077,6 +2082,7 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver,
>      qemuDomainJobInfoPtr jobInfo = priv->job.current;
>      const char *job;
>      int pauseReason;
> +    bool inPhase2 = (jobInfo->type == VIR_DOMAIN_JOB_PHASE1_COMPLETED);

I think it would be cleaner to pass this info in a new parameter for
qemuMigrationWaitForCompletion instead of doing a hidden black magic
here.

>  
>      switch (priv->job.asyncJob) {
>      case QEMU_ASYNC_JOB_MIGRATION_OUT:
> @@ -2092,9 +2098,11 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver,
>          job = _("job");
>      }
>  
> -    jobInfo->type = VIR_DOMAIN_JOB_UNBOUNDED;
> +    if (!inPhase2)
> +        jobInfo->type = VIR_DOMAIN_JOB_UNBOUNDED;
>  
> -    while (jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED) {
> +    while (jobInfo->type == VIR_DOMAIN_JOB_UNBOUNDED ||
> +           (inPhase2 && jobInfo->type == VIR_DOMAIN_JOB_PHASE1_COMPLETED)) {

Just use jobInfo->status.status directly.

>          /* Poll every 50ms for progress & to allow cancellation */
>          struct timespec ts = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000ull };
>  
> @@ -2123,7 +2131,8 @@ qemuMigrationWaitForCompletion(virQEMUDriverPtr driver,
>          virObjectLock(vm);
>      }
>  
> -    if (jobInfo->type == VIR_DOMAIN_JOB_COMPLETED) {
> +    if (jobInfo->type == VIR_DOMAIN_JOB_COMPLETED ||
> +        jobInfo->type == VIR_DOMAIN_JOB_PHASE1_COMPLETED) {

This shouldn't be needed when 5/8 is dropped.

>          qemuDomainJobInfoUpdateDowntime(jobInfo);
>          VIR_FREE(priv->job.completed);
>          if (VIR_ALLOC(priv->job.completed) == 0)
> @@ -3149,6 +3158,16 @@ qemuMigrationConfirmPhase(virQEMUDriverPtr driver,
>  
>      virCheckFlags(QEMU_MIGRATION_FLAGS, -1);
>  
> +    /* Wait for post-copy to complete */
> +    if (flags & VIR_MIGRATE_POSTCOPY) {
> +        bool abort_on_error = !!(flags & VIR_MIGRATE_ABORT_ON_ERROR);
> +        rv = qemuMigrationWaitForCompletion(driver, vm,
> +                                            QEMU_ASYNC_JOB_MIGRATION_OUT,
> +                                            conn, abort_on_error);
> +        if (rv < 0)
> +            goto cleanup;
> +    }
> +

This has to be done after setting the phase. We may need to add a new so
that we can recover from post-copy migrations when libvirtd is
restarted.

>      qemuMigrationJobSetPhase(driver, vm,
>                               retcode == 0
>                               ? QEMU_MIGRATION_PHASE_CONFIRM3

Jirka

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list




[Index of Archives]     [Virt Tools]     [Libvirt Users]     [Lib OS Info]     [Fedora Users]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [KDE Users]     [Fedora Tools]