Re: [RFC PATCH 1/n] snapshot: add revert-and-create branching of external snapshots

Peter Krempa <pkrempa@xxxxxxxxxx> · Wed, 14 Nov 2012 12:26:20 +0100




On 11/14/12 01:47, Eric Blake wrote:
On 11/13/2012 08:26 AM, Peter Krempa wrote:
On 11/09/12 05:47, Eric Blake wrote:
Right now, libvirt refuses to revert to the state at the time
of an external snapshot, because doing so would reset things so
that the next time the domain boots, we are using the backing
file; but modifying the backing file invalidates all qcow2
files that are based on top of it.  There are three possibilities
for lifting this restriction:
1. delete all snapshot metadata and qcow2 files that are
invalidated by the revert (losing changes since the snapshot)
2. perform a block commit (such as with qemu-img commit) to
merge the qcow2 file back into the backing file (keeping the
changes since the snapshot)
3. create NEW qcow2 files that wrap the same base file as
the original snapshot (keeping the changes since the original
snapshot)

We will need to think of a mechanism how to return to a branch of the
snapshot tree that was previously left by this revert-and-branch
operation. Otherwise you will need either to:
1) create a new snapshot on the previous branch prior leaving it
2) if your machine was running when you were leaving your branch, it
will need to be reverted to the previous snapshot state
3) we will need to add a ability to add a automagic temporary snapshot
that will not create new disk overlays but will store the memory image
to allow reverting to the end of the branch

Hmm, if the guest is offline and you are branching away from an external
snapshot, the disk image already contains the old state.  But you are
right that for an online guest, it might be worth having a way to save a
checkpoint just before the revert - and this is true whether the point
you are going to was an internal or an external snapshot (although for
first implementation, I will only allow branching from an external
snapshot).  In fact, this is true even if we don't add the atomic
revert-and-create operation - even a plain revert can abandon runtime
state that might be nice to remember.

My original idea was that specifying <branch> means that <memory> is
ignored on the incoming <domainsnapshot> XML; but maybe I should instead
state that the <memory> controls what to do with the branch-point being
abandoned: snapshot=no says discard the runtime state (what revert has
always done, even without the create operation merged in), and
'snapshot=internal name=...' or 'snapshot=external file=...' says to
create a new checkpoint before abandoning running state (so the
revert-and-create branch operation would actually be creating two
snapshots at once).

That looks like a reasonable option.


Or we could just say that this is the responsibility of management apps
to avoid abandoning running state - anyone wanting to return to a
running domain is responsible for first pausing and saving that domain
as an independent operation, and then only use the revert-and-create
from an offline or paused state.  In a way, it would be easier to insist
that branching only ever be done from an offline state; but transient
domains throw a wrench in that idea (there is never an offline state for
transient domains).

In that case we will need to hack that up in virsh anyways. But I think 
it makes a bit more sense to have that functionality in the management 
app as trying to do it in a way that could be limited in some aspect.


Another thought - maybe snapshots should remember if the domain has ever
been unpaused after the snapshot was first taken.  That is, a snapshot
starts life clean as long as the domain remains offline or paused, but
is marked dirty the first time the guest is allowed to run (or if any
changes are made to the domain definition, such as a hotplug); knowing
if a snapshot is dirty would determine whether an additional snapshot is
even needed before leaving that branch in a revert operation.

Maybe virDomainRevertToSnapshot() needs a way to do a create-and-revert
mode of operation; and then virDomainSnapshotCreateXML needs a
create-revert-create operation.  Perhaps it is time to introduce a new
API to give better control over a snapshot being created before the
revert, vs. the snapshot being created after the revert; although that
would not be something that can be backported to older releases
(sometimes there are interesting design trade-offs for what can be done
without changing the .so entry points).

I'll think about that some more as I continue writing patches.  But for
now, since our existing revert has always discarded runtime state (and
left it to management apps to do their own snapshot prior to revert if
they don't want to lose anything), keeping that as the status quo
doesn't hurt.

Ah yeah. I agree on this one at this point. I'm just making those points 
for what I'd like to see from a users perspective.


1.
Enhance virDomainSnapshotListFlags to add two new filter groups:
VIR_DOMAIN_SNAPSHOT_LIST_OFFLINE
VIR_DOMAIN_SNAPSHOT_LIST_ONLINE
VIR_DOMAIN_SNAPSHOT_LIST_DISK_ONLY

VIR_DOMAIN_SNAPSHOT_LIST_INTERNAL
VIR_DOMAIN_SNAPSHOT_LIST_EXTERNAL

Now done in a separate thread.

4.
Think about ramifications on revert - right now, with no options,
revert means to go back to the state where the snapshot was created;
but now that we allow the creation of branches, and since the branches
have external disk state which persists even when not on the branch, we
need a new revert flag that says to revert to an external snapshots
and its subsequent changes

Ah yeah, this would be applicable. This would work if you shut down your
guest prior to leaving the branch (or can justify loosing data.) And if
you want to return to an existing state, you still can create another
snapshot. Although it would be cool to have it automatic.

For an online guest, if we allow the create-before-revert, then the
solution is to revert to the checkpoint taken before we did the revert.

Or as you said: Leave this stuff on the mgmt app. That seems a bit more 
reasonable for me now. (But it'd be great to add that as a single 
operation in virsh)

  But for an offline guest, we shouldn't need a magic snapshot at the end
of a branch; so I still think we need a new mode to revert that says to
revert to the changes beyond the external snapshot.

In case of an offline guest, we will need to remember it was offline 
when we were leaving the branch (in case the previous snapshot was 
active/online) and from the other side we'd need to remember that the 
guest was started and the running state was discarded (and was started 
from an offline snapshot). Otherwise we might mess stuff up.


5.
Think about ramifications on delete - for example, if two branches
of external snapshots share a common memory state file, and we delete
only one of the two branches, the memory state file must not be
deleted.

Or we could use the name provided in the new snapshot XML to store the
same memory image (although copying it might not be that cool, it would
save a ton of logic to wire that up).

Or use hard-links instead of copies, and let the file system do the
reference counting for us :)  But if we do that, _and_ we do the idea of
create-revert-create for marking the old point prior to creating the new
branch, then the user has to be able to specify the external memory file
name of both snapshots.

Well but the specified path might be on other filesystem. Users are 
sometimes doing crazy things :)

Hm that would make the snapshot XMLs even more complex. I think that at 
the start we should go for the basic functionality and leave the burden 
of doing this on the mgmt app.


Lots of ideas; thankfully, I don't see a problem with adding features
incrementally.

+++ b/docs/formatsnapshot.html.in
@@ -87,7 +87,12 @@
         sets that snapshot as current, and the prior current snapshot is
         the parent of the new snapshot.  Branches in the hierarchy can
         be formed by reverting to a snapshot with a child, then creating
-      another snapshot.
+      another snapshot.  In the case of external snapshots, modifying
+      the backing files would invalidate all external files that
+      depend on the backing file, so the action of reverting to a
+      snapshot must be accompanied by either a request to delete all
+      invalidated snapshots, or to create a new snapshot at the same
+      time as the revert.

Hm, this comment should probably be in my patch adding the revert
functionality for external snapshots. (With a few changes stating for
example that --force invalidates the image chain and doesn't care)

Well, the phrase about 'create a new snapshot at the same time as the
revert' applies to this patch, but the phrase about 'request to delete
all invalidated snapshots' indeed goes better with yours.  I'm okay if
the documentation is slightly inaccurate as our patches converge, as
long as the final documentation in 1.0.1 matches what we have together.
  So it may be best to just separate this into its own patch, to be
applied after both our code approaches have been merged.

Hm, yeah, that would also make sense, if we make it before the release 
so we don't end up with incomplete docs.


+++ b/docs/schemas/domainsnapshot.rng
@@ -85,6 +85,11 @@
               </element>
             </element>
           </optional>
+        <optional>
+          <element name='parent'>

Doesn't match the docs. s/parent/branch/

Oops.  Proof that I need to add a test case for RNG validation.

+ * If @flags includes VIR_DOMAIN_SNAPSHOT_CREATE_BRANCH, then @xmlDesc
+ * must describe an existing external snapshot as well as actions to
+ * create a new snapshot from the point where the existing snapshot was
+ * created rather than the current state of the guest.  In this mode,
+ * the new snapshot branch is created but not activated unless the
+ * VIR_DOMAIN_SNAPSHOT_CREATE_CURRENT flag is also present.  It is an
+ * error to request VIR_DOMAIN_SNAPSHOT_CREATE_LIVE or
+ * VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE in this mode.

Actually, if we add a create-revert-create ability, then _LIVE would
apply to the first create (the branch we are leaving).  But it
definitely doesn't work for the branch we are creating.

@@ -18673,6 +18690,14 @@ error:
    * inactive snapshots with a @flags request to start the domain after
    * the revert.
    *
+ * Reverting to a snapshot with external state and then running the
+ * domain would invalidate all external files that depend on the backing
+ * file.  Therefore, this function will fail unless any child external
+ * snapshots of the revert point first been deleted or merged.  It is
+ * also possible to effectively revert and create a new snapshot branch
+ * in one operation, without invalidating snapshots, by using the
+ * VIR_DOMAIN_SNAPSHOT_CREATE_BRANCH flag of
virDomainSnapshotCreateXML().
+ *

This also probably should go into the revert series.

Indeed, also a candidate for a final doc cleanup patch after both of our
improvements are in.

@@ -340,6 +349,19 @@ cmdSnapshotCreateAs(vshControl *ctl, const vshCmd
*cmd)
           flags |= VIR_DOMAIN_SNAPSHOT_CREATE_ATOMIC;
       if (vshCommandOptBool(cmd, "live"))
           flags |= VIR_DOMAIN_SNAPSHOT_CREATE_LIVE;
+    if (vshCommandOptBool(cmd, "branch")) {
+        flags |= VIR_DOMAIN_SNAPSHOT_CREATE_BRANCH;
+        if (vshCommandOptString(cmd, "branch", &branch) < 0) {
+            vshError(ctl, _("argument must not be empty"));

This error message isn't very helpful when viewed out of context. Add
the parameter name to make it clear please.

Copy and paste, so I'll also fix the point it copied from.


Looks good. There are a few details we should sort out before doing this
but I think this is useful and the right way.

I was also thinking of enhancing virsh with a ability to generate memory
image names from the snapshot names when the user only provides the
prefix. This saves a lot of hassle when doing multiple snapshots in a
row with the "[up-arrow][enter]" approach.

In fact, the names we generate for --disk-only are generated by libvirt
(snapshot_conf.c:virDomainSnapshotAlignDisks), not virsh; so if we come
up with a reasonable approach for auto-naming an external memory
snapshot, it should be done in the same place.  But yes, an ability to
autoname external snapshots would be a nice independent patch.

I was thinking about that quite a lot, but didn't figure any good enough 
option for the memory. We might save it where the disks are stored or 
create a special folder for that. But I didn't like any of them that much.




--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list


--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list