Re: [PATCH] qemu: allow blkstat/blkinfo calls during migration

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 05/11/2011 07:26 AM, Federico Simoncelli wrote:
> Originally most of libvirt domain-specific calls were blocking
> during a migration.
> A new mechanism to allow specific calls (blkstat/blkinfo) to be
> executed in such condition has been implemented.
> In the long term it'd be desirable to get a more general
> solution to mark further APIs as migration safe, without needing
> special case code.
> 
>  * src/qemu/qemu_migration.c: add some additional job signal
>    flags for doing blkstat/blkinfo during a migration
>  * src/qemu/qemu_domain.c: add a condition variable that can be
>    used to efficiently wait for the migration code to clear the
>    signal flag
>  * src/qemu/qemu_driver.c: execute blkstat/blkinfo using the
>    job signal flags during migration
> ---
>  src/qemu/qemu_domain.c    |    6 +++
>  src/qemu/qemu_domain.h    |    9 ++++
>  src/qemu/qemu_driver.c    |  103 ++++++++++++++++++++++++++++++++-------------
>  src/qemu/qemu_migration.c |   35 +++++++++++++++

This conflicts with patch 10/16 in Dan's migration patches [1].  I'll
hold off pushing it until Dan's patches are in, and hopefully the rebase
is not too difficult.  You could try the rebase now on his preview
repository [2]

[1] https://www.redhat.com/archives/libvir-list/2011-May/msg00771.html
[2] http://gitorious.org/~berrange/libvirt/staging/commits/migrate-locking-3

That said, though, it looks like you addressed my findings from v1.
Unfortunately, I still see a couple of places for improvement.

> +++ b/src/qemu/qemu_domain.c
> @@ -110,6 +110,11 @@ static void *qemuDomainObjPrivateAlloc(void)
>      if (VIR_ALLOC(priv) < 0)
>          return NULL;
>  
> +    if (virCondInit(&priv->signalCond) < 0) {
> +        VIR_FREE(priv);
> +        return NULL;

As long as we're touching this function, let's also fix the bug where
priv->jobCond was never initialized.

> +    if ((priv->jobActive == QEMU_JOB_MIGRATION_OUT)
> +        || (priv->jobActive == QEMU_JOB_SAVE)) {
> +        virDomainObjRef(vm);
> +        while (priv->jobSignals & QEMU_JOB_SIGNAL_BLKSTAT)
> +            ignore_value(virCondWait(&priv->signalCond, &vm->lock));

Hmm, should we mark priv->jobSignals as volatile in the header, to
ensure the compiler won't optimize this into an infinite loop?

> +++ b/src/qemu/qemu_migration.c
> @@ -158,6 +158,38 @@ qemuMigrationWaitForCompletion(struct qemud_driver *driver, virDomainObjPtr vm)
>                  VIR_WARN0("Unable to set migration speed");
>          }

The earlier 'else if' chain should likewise be broken into consecutive
'if's.  In Dan's patch, this moved into the new method
qemuMigrationProcessJobSignals

>  
> +        if (priv->jobSignals & QEMU_JOB_SIGNAL_BLKSTAT) {
> +            qemuDomainObjEnterMonitorWithDriver(driver, vm);
> +            rc = qemuMonitorGetBlockStatsInfo(priv->mon,

I'm still wondering if we need to hold the signalLock condition during
the duration where we drop driver lock to call out to the monitor.  I
can't convince myself that we need to, but I also can't convince myself
that your code is safe without it (I guess it goes to show that I
haven't done much programming on condition variables in any prior job -
they're cool tools, but hard to wrap your head around when first
learning them).

> +                                  priv->jobSignalsData.statDevName,
> +                                  &priv->jobSignalsData.blockStat->rd_req,
> +                                  &priv->jobSignalsData.blockStat->rd_bytes,
> +                                  &priv->jobSignalsData.blockStat->wr_req,
> +                                  &priv->jobSignalsData.blockStat->wr_bytes,
> +                                  &priv->jobSignalsData.blockStat->errs);
> +            qemuDomainObjExitMonitorWithDriver(driver, vm);
> +
> +            priv->jobSignalsData.statRetCode = rc;
> +            priv->jobSignals ^= QEMU_JOB_SIGNAL_BLKSTAT;
> +
> +            if (rc < 0)
> +                VIR_WARN0("Unable to get block statistics");
> +        }
> +
> +        if (priv->jobSignals & QEMU_JOB_SIGNAL_BLKINFO) {

Oh dear, I just realized a bug.  By breaking 'else if' into separate
'if', we now need to check if the VM is alive after every time we regain
the driver lock (that is, after every qemuDomainObjExitMonitorWithDriver).

Hmm, maybe the better approach is to keep things as an 'else if' chain,
but to make qemuMigrationProcessJobSignals be a while loop that iterates
until all bits are serviced (so that the live vm check is factored to
only appear once in the loop body).  Back when these checks were part of
the larger qemuMigrationWaitForCompletion function, the overall loop
also included a sleep, which is what I didn't like; but with Dan's patch
in place, a loop with no sleep seems reasonable.

> +            qemuDomainObjEnterMonitorWithDriver(driver, vm);
> +            rc = qemuMonitorGetBlockExtent(priv->mon,
> +                               priv->jobSignalsData.infoDevName,
> +                               &priv->jobSignalsData.blockInfo->allocation);
> +            qemuDomainObjExitMonitorWithDriver(driver, vm);
> +
> +            priv->jobSignalsData.infoRetCode = rc;
> +            priv->jobSignals ^= QEMU_JOB_SIGNAL_BLKINFO;
> +
> +            if (rc < 0)
> +                VIR_WARN0("Unable to get block information");

Hmm, your rebase still hasn't picked up the latest patches; you need to
s/VIR_WARN0/VIR_WARN/ per commit b65f37a.

-- 
Eric Blake   eblake@xxxxxxxxxx    +1-801-349-2682
Libvirt virtualization library http://libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list

[Index of Archives]     [Virt Tools]     [Libvirt Users]     [Lib OS Info]     [Fedora Users]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [KDE Users]     [Fedora Tools]