On 11/01/2012 10:22 AM, Peter Krempa wrote: > This patch adds support to take external system checkpoints. > > The functionality is layered on top of the previous disk-only snapshot > code. When the checkpoint is requested the domain memory is saved to the > memory image file using migration to file. (The user may specify to > do take the memory image while the guest is live with the s/do // > VIR_DOMAIN_SNAPSHOT_CREATE_LIVE flag.) > > The memory save image shares format with the image created by > virDomainSave() API. > --- > This version adds support for pausing the guest while the snapshot is being taken. > If this happens while the migration job is running the machine is unpaused after the snapshot is done. > (This should mirror the behavior of live migration). As there isn't any support for aborting other jobs > than migration this should be safe. > --- > src/qemu/qemu_driver.c | 242 +++++++++++++++++++++++++++++++++---------------- > 1 file changed, 162 insertions(+), 80 deletions(-) > > > +qemuDomainSnapshotCreateDiskActive(struct qemud_driver *driver, > + virDomainObjPtr vm, > virDomainSnapshotObjPtr snap, > - unsigned int flags) > + unsigned int flags, > + enum qemuDomainAsyncJob asyncJob) > { > - virDomainObjPtr vm = *vmptr; > qemuDomainObjPrivatePtr priv = vm->privateData; > virJSONValuePtr actions = NULL; > - bool resume = false; > int ret = -1; > int i; > bool persist = false; > int thaw = 0; /* 1 if freeze succeeded, -1 if freeze failed */ Ouch. You left the quiesce logic in qemuDomainSnapshotCreateDiskActive, but changed the caller so that this point can be reached after the domain has been already paused. We already reject _QUIESCE without _DISK_ONLY, so we know there is no memory to worry about, but the quiesce must occur before pausing things. That means you need to move the quiesce and thaw logic out of CreateDiskActive and into CreateActiveExternal. > +static int > +qemuDomainSnapshotCreateActiveExternal(virConnectPtr conn, > + struct qemud_driver *driver, > + virDomainObjPtr *vmptr, > + virDomainSnapshotObjPtr snap, > + unsigned int flags) > +{ > + /* we need to resume the guest only if it was previously running */ > + if (virDomainObjGetState(vm, NULL) == VIR_DOMAIN_RUNNING) { > + resume = true; > + > + /* For multiple disks, libvirt must pause externally to get all > + * snapshots to be at the same point in time, unless qemu supports > + * transactions. For a single disk, snapshot is atomic without > + * requiring a pause. Thanks to qemuDomainSnapshotPrepare, if > + * we got to this point, the atomic flag now says whether we need > + * to pause, and a capability bit says whether to use transaction. > + */ > + if ((memory && !(flags & VIR_DOMAIN_SNAPSHOT_CREATE_LIVE)) || > + (atomic && !transaction)) { Hmm, I'm thinking I still got this slightly wrong: if memory is specified, the we are guaranteed that the guest will pause before the disk side of things, even if we support transaction. The only time we can get away without pausing the guest is for disk-only and transaction support. But as written, this would end up pausing if memory and _LIVE are specified but we lack transaction, which undoes the point of _LIVE. So we need a !memory conjunct in the second half of ||. > + > + /* now the domain is now paused if: > + * - if a memory snapshot was requested > + * - an atomic snapshot was requested AND > + * qemu does not support transactions > + * > + * Next we snapshot the disks. > + */ > + if ((ret = qemuDomainSnapshotCreateDiskActive(driver, vm, snap, flags, > + QEMU_ASYNC_JOB_SNAPSHOT)) < 0) > + goto endjob; > @@ -11428,21 +11494,37 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, > } > } > > + /* reject the VIR_DOMAIN_SNAPSHOT_CREATE_LIVE flag where not supported */ > + if ((flags & VIR_DOMAIN_SNAPSHOT_CREATE_LIVE) && > + (!virDomainObjIsActive(vm) || > + snap->def->memory != VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL)) { > + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", > + _("live snapshot creation is supported only " > + "with external checkpoints")); > + goto cleanup; > + } More on why you can't mix _QUIESCE with _LIVE external memory snapshots: we don't control when the guest will finally converge and pause, so we can't trigger the guest agent to quiesce at the right time. We don't want to quiesce up front, as that defeats the purpose of a live snapshot, if the guest can't do any I/O for the entire live migration-to-file. So here, we should error out if _QUIESCE and _LIVE are both given; or even stronger - error out if both _QUIESCE and external memory are requested (it is always easier to relax later if someone can prove it would be useful, than it is to provide it now and wish we hadn't). On the bright side, this forbidden combination is no real loss: if you are snapshotting memory, then you don't need to worry about inconsistent disk state, since you have the memory that goes along with any in-flight I/O at the time of the snapshot - quiescing only makes sense if you are going to save disk state without in-flight I/O to go along with it. Looking at the code, we already forbid _QUIESCE if _DISK_ONLY is not present; and re-reading my table in commit 4201a7ea, all that remains is to reject the combination of DISK_ONLY and a memory snapshot request. Likewise, _LIVE and _REDEFINE are incompatible. Hmm, I see that src/libvirt.c already filters out some impossible flag combinations, so maybe my proposals below should be moved. I'm playing with squashing this in, but I also need to fix the quiesce outside pause issue before I can give ACK, if you can beat me to it. diff --git i/src/qemu/qemu_driver.c w/src/qemu/qemu_driver.c index d5cfdcc..35c8670 100644 --- i/src/qemu/qemu_driver.c +++ w/src/qemu/qemu_driver.c @@ -11498,14 +11498,23 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, } /* reject the VIR_DOMAIN_SNAPSHOT_CREATE_LIVE flag where not supported */ - if ((flags & VIR_DOMAIN_SNAPSHOT_CREATE_LIVE) && + if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_LIVE && (!virDomainObjIsActive(vm) || - snap->def->memory != VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL)) { + snap->def->memory != VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL || + flags & VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE)) { virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", _("live snapshot creation is supported only " "with external checkpoints")); goto cleanup; } + if ((snap->def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_EXTERNAL || + snap->def->memory == VIR_DOMAIN_SNAPSHOT_LOCATION_INTERNAL) && + flags & VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY) { + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", + _("disk-only snapshot creation is not compatible with " + "memory snapshot")); + goto cleanup; + } /* actually do the snapshot */ if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_REDEFINE) { -- Eric Blake eblake@xxxxxxxxxx +1-919-301-3266 Libvirt virtualization library http://libvirt.org
Attachment:
signature.asc
Description: OpenPGP digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list