On 11/21/2012 07:16 AM, Peter Krempa wrote:
>>>> +    /* wipe and re-create disk images - qemu-img doesn't care if it
>>>> exists */
>>>> +    if (qemuDomainSnapshotCreateInactiveExternal(driver, vm, snap,
>>>> false) < 0)
>>>> +        goto endjob;
>>>
>>> This comment says you are throwing away state, but without the use of
>>> any flag guarding things that the user meant to throw away state. I'm a
>>> bit worried that we're missing a flag here to state that we are
>>> reverting to the point of the external snapshot _and_ want to throw away
>>> all changes that happened in the external file.
>>>
>>> Also, does qemuDomainSnapshotCreateInactiveExternal(,,,false) actually
>>> do what you want here? I thought that invocation fails if the file
>>> already exists (and passing true assumes that the file already exists,
>>> but does not otherwise touch it). Do we instead need to change that
>>> boolean into an 'int', with multiple values (0 and 1 for when the user
>>> first creates the snapshot, depending on whether they passed the
>>> REUSE_EXT flag, and 2 for this function, which says to forcefully
>>> recreate)?
>>
>> After thinking a bit more, I think we need the following additional
>> flags to virDomainRevertToSnapshot, and require that exactly one of
>> these three mutually exclusive flags is present when reverting to an
>> external snapshot:
>>
>> VIR_DOMAIN_SNAPSHOT_REVERT_FOLLOW - revert to the current state of disks
>> after an external snapshot (I'm about to post work on this flag - yes,
>> it will conflict with what you've done so far)
>
> With this flag, the description of the API function is starting to drift
> away from its original "revertTo" semantic. This flag is useful in the
> meaning it has, but I'm not quite sure if I like the implications it
> would have:
>
> 1) You cannot do this with a checkpoint unless you wish to revert
> without the memory state.
Yes, but if you were careful to shut down first before leaving a branch,
that's what you want; and if you (the management app) took a snapshot
prior to leaving a branch, then you would revert to that NEW snapshot
rather than using revert-with-follow to the old snapshot.

>
> 2) Even if you shut down the machine correctly, you will need to
> remember that and revert a checkpoint to the offline state with this
> flag.

That's where the already-existing REVERT_RUNNING and REVERT_PAUSED flags
come into play - revert-with-follow would default to shut off, but you
can request that the guest be booted from the current disk state.

>
> This is the original reason I was talking about the automatic
> meta-snapshots that could be used to revert to the current state of a
> branch that was left. When you want to leave a branch in live state you
> have to create a checkpoint in any case, so this flag would be usable
> just for disk snapshots.

Rather, the result of using this flag would be as if the automatic
meta-snapshot were always a disk snapshot. If you leave an offline
branch, all is well; if you leave an online branch, then the disk may
have lost pending I/O (so don't revert from a running machine if you
want to return to that branch).

The idea of doing an automatic meta-snapshot prior to leaving a branch
may still make sense, but would it be internal or external? Offline
snapshots (whether internal or external) are relatively fast, but online
snapshots take time no matter whether they are internal or external
(with internal having the further complication of requiring qcow2
disks). I guess we'll have to continue thinking about automatic
snapshots before leaving a branch, but it doesn't affect the usefulness
of your patches so far.
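To make the "exactly one of three" requirement concrete, here's a
minimal C sketch of the kind of flag validation the revert entry point
would need. Note these flags are only proposals from this thread, not
part of the libvirt API, and the bit values below are made up purely for
illustration:

```c
/* Hypothetical flag bits -- proposals from this discussion, NOT the
 * real libvirt API; the values are invented for this sketch. */
enum {
    REVERT_FOLLOW = 1 << 0,  /* follow current disk state past the snapshot */
    REVERT_RESET  = 1 << 1,  /* wipe external files back to snapshot point */
    REVERT_COMMIT = 1 << 2,  /* commit external files into backing files */
};

/* Return 0 iff exactly one of the three mutually exclusive mode flags
 * is present; -1 for zero or for two-or-more. */
static int
check_external_revert_flags(unsigned int flags)
{
    unsigned int mode = flags & (REVERT_FOLLOW | REVERT_RESET | REVERT_COMMIT);

    /* mode && !(mode & (mode - 1)) tests for exactly one bit set */
    if (mode && !(mode & (mode - 1)))
        return 0;
    return -1;
}
```

A caller would reject the revert up front with the usual
"invalid argument" error when this check fails.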
>
>> VIR_DOMAIN_SNAPSHOT_REVERT_RESET - revert to the point where an external
>> snapshot was taken, and reset the external files to be exactly that
>> state again (the semantics of _this_ patch)
>
> and the original meaning of the "revertTo" semantic of this API

Hmm, maybe we don't need a flag here after all for your patch. Thinking
about the internal snapshot case, it's important to remember that there
is always an implicit state (the current state), as well as any number
of named states in a qcow2 file. Each cluster has a reference count
based on how many entry states can reach that cluster (with
copy-on-write semantics any time the current state goes to write to a
cluster). Consider the following setup of snapshots (in 'virsh
snapshot-list --tree' format):

a
  |
  +- b
      |
      +- c
      |   |
      |   +- d
      |
      +- e

created by taking snapshot 'a', taking snapshot 'b', taking snapshot
'c', taking snapshot 'd', reverting to snapshot 'b', then taking
snapshot 'e'. The action of reverting to 'b' did NOT invalidate
snapshots c or d, but did discard any state changes that happened after
'd'. Another way of looking at this is that reverting to 'b' really
means changing the implicit 'current' tip to initially be identical to
'b'.

Looking at your patch, you are refusing to revert to anything but a leaf
snapshot; that's good, because unlike the internal case, the moment we
wipe an external file, we have lost any internal snapshots it may
contain, and we have also invalidated any further external snapshots
that use it as a backing file. So your code will permit a revert to
exactly one of two places: 'd' (if already executing on branch c/d) or
'e' (if already executing on branch e), and if there is no automatic
snapshot taken at the tip, then the user is already throwing away
running state, the same as they would be for reverting to an internal
snapshot.
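The refcount/copy-on-write bookkeeping described above can be sketched
in a few lines of C. This is a toy model, not qcow2's actual on-disk
refcount tables; it just shows why taking a snapshot bumps cluster
refcounts and why a later write from the current state leaves the
snapshot's view intact:

```c
/* Toy model of qcow2-style cluster refcounting -- illustrative only.
 * Each named snapshot plus the implicit "current" state holds a
 * reference on every cluster it can reach. */
#define MAX_CLUSTERS 16

static int refcount[MAX_CLUSTERS];

/* Taking a snapshot: every cluster the current state references gains
 * one more reference, since the named snapshot now also reaches it. */
static void
take_snapshot(const int *current_map, int n)
{
    for (int i = 0; i < n; i++)
        refcount[current_map[i]]++;
}

/* Writing through the current state: if the cluster is shared with a
 * snapshot, copy-on-write allocates a fresh cluster for the current
 * state and drops one reference from the shared one; otherwise the
 * write happens in place.  Returns the cluster actually written. */
static int
cow_write(int cluster, int *next_free)
{
    if (refcount[cluster] > 1) {
        refcount[cluster]--;          /* snapshot still reaches the old one */
        refcount[*next_free] = 1;     /* current state owns the new one */
        return (*next_free)++;
    }
    return cluster;
}
```

After a revert, the same machinery applies in reverse: the "current"
state simply re-references the snapshot's clusters, which is why
reverting to 'b' never invalidates 'c' or 'd'.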
Okay, I think you've convinced me that your patch doesn't need a flag,
and that it matches the semantics of internal revert, or else fails
because it can't match those semantics (at which point, you can use
--force to throw things away, or my revert-and-commit to create a new
branch from arbitrary points rather than only leaf snapshots).

>>
>> VIR_DOMAIN_SNAPSHOT_REVERT_COMMIT - revert to the current state of disks
>> after an external snapshot, but commit the state of the external files
>> into the backing files. Not possible if there is another snapshot
>> branched off the same backing file (unless we want to let _FORCE allow
>> us to potentially invalidate those other branches)
>
> ... and also invalidates checkpoints.

Well, here we could recreate the checkpoint at the time of the new
revert. After all, a commit operation (without a discard) says that we
don't need to go back to the earlier point in time, but we still want to
remember something at the current point in time. Also, this operation
would be time-consuming (disk commits are slower than creating new qcow2
wrappers), so it would need to run under an async job and allow
cancellation (but what does that mean for the snapshot if you cancel
halfway through?).

>
>>
>> I can also see the use for a revert-and-delete operation rolled into
>> one, via a new flag:
>>
>> VIR_DOMAIN_SNAPSHOT_REVERT_DISCARD - after reverting to a snapshot,
>> delete it (that is, we no longer plan to revert to it again in the
>> future). When mixed with FOLLOW, we keep the backing chain with no
>> change to disk contents; when mixed with RESET, we keep the backing
>> chain but reset it to start fresh; and when mixed with COMMIT, we
>> shorten the backing chain back to its pre-snapshot length.
>
> This might be useful, but is basically syntax sugar (once snapshot
> deletion is in place).
Not necessarily sugar - if you follow my argument about COMMIT needing
to re-create the checkpoint state at the current point in time without
this flag, then using this flag can omit that recreation step and
definitely save time. But we can worry about that more when we actually
get around to writing a COMMIT implementation.

> Right now, the only way this API can be used is if the snapshot is the
> last in the chain or if you specify the FORCE flag, so with the given
> state of libvirt right now you lose only the changes that are not part
> of a checkpoint/snapshot if you revert somewhere _or_ you specified
> --force, and then you sign off on wanting to do anything.

So the --force flag does let the user invalidate other snapshots,
backing file chains, or whatever, but in that case they asked for it.
Meanwhile, since your version can only revert to leaf snapshots, my
revert-and-create can revert anywhere (to create a branch), and my
proposed revert-and-follow flag can then switch branches (with the
management app responsible for ensuring sane state before leaving a
branch), I think we are doing pretty well on the revert front. I'll post
a rebase of your patch series merged with mine (as I'm finding I need
some of your code to implement my revert-and-follow work).

>
> As for re-creating the disk images, I tried the approach manually and
> qemu-img doesn't care much whether the file exists; it overwrites it.

Indeed, now that I think about it more: via the public API, we reject
things up front if the image file already exists; and since our backdoor
didn't repeat the check, we get the right behavior for both the public
API (the file doesn't exist, so create it as empty) and your revert API
(forcefully recreate the file as empty). Cool!

>
>>
>> I think I could ACK this patch if you respun it to require my proposed
>> RESET flag to match your intent of discarding user disk state. Also,
>> before you decide too much, you'd better read up on my respin with my
>> proposed FOLLOW flag.
Updated plan of attack - I'll repost your series with the final tweaks I
think it needs as part of my series, and you can take that as my ACK of
your code. So I'm now working on the revert-with-follow semantics, and
am assuming that you are working on the delete code.

-- 
Eric Blake   eblake@xxxxxxxxxx    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list