Re: [PATCH v2 8/8] virsh: add postcopy-after option to migrate command

Cristian Klein <cristian.klein@xxxxxxxxx> · Thu, 9 Oct 2014 16:22:54 +0900

On 07 Oct 2014, at 23:48 , Jiri Denemark <jdenemar@xxxxxxxxxx> wrote:

> On Tue, Sep 30, 2014 at 16:39:29 +0200, Cristian Klein wrote:
>> Signed-off-by: Cristian Klein <cristian.klein@xxxxxxxxx>
>> ---
>> tools/virsh-domain.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++--
>> tools/virsh.pod      |  5 ++++
>> 2 files changed, 75 insertions(+), 2 deletions(-)
>> 
>> diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c
>> index 36a6d52..395c73c 100644
>> --- a/tools/virsh-domain.c
>> +++ b/tools/virsh-domain.c
>> @@ -9251,6 +9251,10 @@ static const vshCmdOptDef opts_migrate[] = {
>>      .type = VSH_OT_INT,
>>      .help = N_("force guest to suspend if live migration exceeds timeout (in seconds)")
>>     },
>> +    {.name = "postcopy-after",
>> +     .type = VSH_OT_INT,
>> +     .help = N_("switch to post-copy migration if live migration exceeds timeout (in seconds)")
>> +    },
>>     {.name = "xml",
>>      .type = VSH_OT_STRING,
>>      .help = N_("filename containing updated XML for the target")
> 
> We also want to add --postcopy option to enable just
> VIR_MIGRATE_POSTCOPY without an automatic switch to post-copy after a
> timeout. The --postcopy-after option may turn it on automatically,
> though, so that users don't have to use --postcopy --postcopy-after. And
> we also want to add a new migrate-startpostcopy command as a direct
> wrapper to virDomainMigrateStartPostCopy.

I’m still not convinced this is a good idea. To use —postcopy and migrate-startpostcopy, the user would have to launch a second instance of virsh, since the first one is busy monitoring the migration. I feel that this is a bit too low level for a tool like virsh: It would be as if we offered the user a dedicated migrate-cancel command, instead of pressing CTRL+C.

>> @@ -9332,6 +9336,8 @@ doMigrate(void *opaque)
>>         VIR_FREE(xml);
>>     }
>> 
>> +    if (vshCommandOptBool(cmd, "postcopy-after")) /* actually an int */
>> +        flags |= VIR_MIGRATE_POSTCOPY;
>>     if (vshCommandOptBool(cmd, "live"))
>>         flags |= VIR_MIGRATE_LIVE;
>>     if (vshCommandOptBool(cmd, "p2p"))
>> @@ -9423,6 +9429,50 @@ vshMigrationTimeout(vshControl *ctl,
>>     virDomainSuspend(dom);
>> }
>> 
>> +static void
>> +vshMigrationPostCopyAfter(vshControl *ctl,
>> +                    virDomainPtr dom,
>> +                    void *opaque ATTRIBUTE_UNUSED)
>> +{
>> +    vshDebug(ctl, VSH_ERR_DEBUG, "starting post-copy\n");
>> +    int rv = virDomainMigrateStartPostCopy(dom, 0);
>> +    if (rv < 0) {
>> +        vshError(ctl, "%s", _("start post-copy command failed"));
>> +    } else {
>> +        vshDebug(ctl, VSH_ERR_INFO, "switched to post-copy\n");
>> +    }
>> +}
>> +
>> +/* Parse the --postcopy-after parameter in seconds, but store its
>> + * value in milliseconds. Return -1 on error, 0 if
>> + * no timeout was requested, and 1 if timeout was set.
>> + * Copy-paste-adapted from vshCommandOptTimeoutToMs.
>> + */
>> +static int
>> +vshCommandOptPostCopyAfterToMs(vshControl *ctl, const vshCmd *cmd, int *postCopyAfter)
>> +{
>> +    int rv = vshCommandOptInt(cmd, "postcopy-after", postCopyAfter);
>> +
>> +    if (rv < 0 || (rv > 0 && *postCopyAfter < 0)) {
>> +        vshError(ctl, "%s", _("invalid postcopy-after parameter"));
>> +        return -1;
>> +    }
>> +    if (rv > 0) {
>> +        /* Ensure that we can multiply by 1000 without overflowing. */
>> +        if (*postCopyAfter > INT_MAX / 1000) {
>> +            vshError(ctl, "%s", _("post-copy after parameter is too big"));
>> +            return -1;
>> +        }
>> +        *postCopyAfter *= 1000;
>> +        /* 0 is a special value inside virsh, which means no timeout, so
>> +         * use 1ms instead for "start post-copy immediately"
>> +         */
>> +        if (*postCopyAfter == 0)
>> +            *postCopyAfter = 1;
>> +    }
>> +    return rv;
>> +}
>> +
>> static bool
>> cmdMigrate(vshControl *ctl, const vshCmd *cmd)
>> {
>> @@ -9432,6 +9482,7 @@ cmdMigrate(vshControl *ctl, const vshCmd *cmd)
>>     bool verbose = false;
>>     bool functionReturn = false;
>>     int timeout = 0;
>> +    int postCopyAfter = 0;
>>     bool live_flag = false;
>>     vshCtrlData data = { .dconn = NULL };
>> 
>> @@ -9451,6 +9502,18 @@ cmdMigrate(vshControl *ctl, const vshCmd *cmd)
>>         goto cleanup;
>>     }
>> 
>> +    if (vshCommandOptPostCopyAfterToMs(ctl, cmd, &postCopyAfter) < 0) {
>> +        goto cleanup;
>> +    } else if (postCopyAfter > 0 && !live_flag) {
>> +        vshError(ctl, "%s",
>> +                 _("migrate: Unexpected postcopy-after for offline migration"));
> 
> Offline migration is a migration of a domain which is not running at
> all. We don't have a good name for migration of a paused domain, some
> refer to it as coma migration. Anyway, this is checked for in the daemon
> so checking here is redundant. Especially when this check seems to be
> somewhat artificial. Sure, it doesn't make a lot of sense but post-copy
> migration of a paused domain should be doable. Moreover, I believe you
> can call virDomainSuspend during a post-copy migration, which is very
> close to running migration without —live.
> 
>> +        goto cleanup;
>> +    } else if (postCopyAfter > 0 && timeout > 0) {
>> +        vshError(ctl, "%s",
>> +                 _("migrate: --postcopy-after is incompatible with --timeout"));
>> +        goto cleanup;
>> +    }
> 
> Anyway, handling --postcopy-after seems quite complicated since we have
> to repeat the logic around -1, 0, 1 return values in two places. Can't
> we just directly fetch and process the value in cmdMigrate?
> 
>> +
>>     if (pipe(p) < 0)
>>         goto cleanup;
>> 

I’ll fix the other issues you have raised.

Cristian

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list