Re: RFC: API additions for enhanced snapshot support

Eric Blake <eblake@xxxxxxxxxx> · Tue, 05 Jul 2011 13:59:52 -0600

On 07/04/2011 08:19 AM, Stefan Hajnoczi wrote:
> On Thu, Jun 16, 2011 at 6:41 AM, Eric Blake <eblake@xxxxxxxxxx> wrote:
> 
> Robert, Fernando, Jagane: I have CCed you because we have discussed
> snapshot APIs and I thought you'd be interested in Eric's work to
> build them for libvirt.
> 
> Does each volume have its own independent snapshot namespace?  It may
> be wise to document that snapshot namespaces are *not* independent
> because storage backends may not be able to provide these semantics.

Good question, and I'm not quite sure on the best way to represent this.

For qcow2 internal snapshots, the answer is obvious - each qcow2 image
has its own snapshot namespace (and you can currently view this with
qemu-img snapshot -l).  But it looks like the existing drive to add qemu
snapshot support to live domains is focusing solely on external
snapshots (that is, a qcow2 snapshot involves creating a new filename,
then marking the old filename as a read-only backing image of the new
qcow2 filename; where qcow2 can even be used as a snapshot around a raw
file).

For external snapshots, I'm not sure whether the namespace should be
specific to a storage pool (that is, any additional metadata that
libvirt needs to track snapshot relationships should be stored in the
same directory as the disk images themselves) or specific to the libvirt
host (that is, just as libvirt's notion of a persistent storage pool
happens to be stored in /etc/libvirt/storage/pool.xml, libvirt should
also manage a file /etc/libvirt/storage/pool/snapshot.xml to describe
all snapshots tracked within "pool").  But I'm certainly thinking that
it is more likely to be a pool-wide namespace, rather than a
storage-volume local namespace.

> 
> I found this formatting quite hard to read.  Something along these
> lines would be easier to parse visually:

Yeah, sometimes there's only so much you can do with plain text emails.
 I'll try to make future revisions of this RFC easier to parse, and/or
get it all in the wiki with nicer formatting first.

>> /* Probe if vol has snapshots.  1 if true, 0 if false, -1 on error.
>> Flags is 0 for now.  */
>> int virStorageVolHasCurrentSnapshot(virStorageVolPtr vol, unsigned int
>> flags);
>> [For qcow2 images, snapshots can be contained within the same file and
>> managed with qemu-img -l, but for other formats, this may mean that
>> libvirt has to start managing externally saved data associated with the
>> storage pool that associates snapshots with filenames.  In fact, even
>> for qcow2 it might be useful to support creation of new files backed by
>> the previous snapshot rather than cramming multiple snapshots in one
>> file, so we may have a use for flags to filter out the presence of
>> single-file vs. multiple-file snapshot setups.]
> 
> What is the "current snapshot"?

I was trying to model this after the virDomainSnapshot API, which has a
notion of current snapshot, but I guess I don't fully understand what
that API was implying by current domain snapshot.

After looking more through the code, it looks like the idea was to
support the notion of a branching hierarchy of snapshots:

A -> B -> D
 \-> C -> E

such that you can revert to the snapshot at point B, start the domain,
and create snapshot D; then stop the domain, revert to the snapshot at
point C, start the domain, and create snapshot E.  Every time you start
the domain, you know from which state it was started, so the current
snapshot is the snapshot that would be the parent if you were to create
a snapshot right now.

As long as you can create a branching hierarchy (that is, after creating
a snapshot, you can revert to its parent state, then make different
changes, and create a new snapshot), then there really is a need to know
which snapshot is current - listing all snapshots can show you the
parent(s) of each snapshot in that list, but can't tell you which
snapshot would be the parent of a new one created right now.

> 
> Is this function necessary when you already have
> virStorageVolSnapshotListNames()?

Maybe I need to understand why we have it for the virDomainSnapshot
case, and whether it still makes sense for a disk image that is not
associated with a domain.  To some degree, I think it seems necessary,
to reflect the fact that with internal qcow2 snapshots, I can do:

qemu-img snapshsot -c one file
run then stop vm to modify file
qemu-img snapshot -c two file
qemu-img snapshot -a one file
run then stop vm to modify file
qemu-img snapshot -c three file

with the resulting hierarchy:

one -> two
   \-> three

On the other hand, qemu-img doesn't appear to list any hierarchies
between internal snapshots - that is, while 'qemu-img snapshot -l' will
list one, two, and three, it gives no indication that three depends on
one but not two, nor whether the current state of the file would be a
delta against three, two, one, or even parent-less.

This also starts to get into questions about the ability to split a
qcow2 image with internal snapshots.  That is, if I have a single file
with snapshot one and a delta against that snapshot as the current disk
state, it would be nice to create a new qcow2 file with identical
contents to snapshot one, then rebase the existing qcow2 file to have a
backing file of my new clone file and delete the internal snapshot from
the original file.  But this starts to sound like work on live block
copy APIs.  For an offline storage volume, we can do things manually
(qemu-img snapshot -c to temporarily create yet another snapshot point
to later return to, qemu-img snapshot -a to revert to the snapshot of
interest, then qemu-img convert to copy off the contents, then qemu-img
snapshot -a to the temporary state, then qemu-img snapshot -d to clean
up both the temporary andstate).  But for a storage volume currently in
use by qemu, this would imply a new qemu command to have qemu assist in
streaming out the contents of the snapshot state.

> 
>> /* Return the most recent snapshot of a volume, if one exists, or NULL
>> on failure.  Flags is 0 for now.  */
>> virStorageVolSnapshotPtr virStorageVolSnapshotCurrent(virStorageVolPtr
>> vol, unsigned int flags);
> 
> The name should include "revert".  This looks like a shortcut function
> for virStorageVolRevertToSnapshot().

No, it was intended as a counterpart to virDomainSnapshotCurrent, which
returns the "current snapshot" if there is one.  But again, it may be
that while a "current snapshot" makes sense for a domain, it might not
make sense for a storage volume in isolation.

> 
>> /* Delete the storage associated with a snapshot (although the opaque
>> snapshot object must still be independently freed).  If flags is 0, any
>> child snapshots based off of this one are rebased onto the parent; if
>> flags is VIR_STORAGE_VOL_SNAPSHOT_DELETE_CHILDREN , then any child
>> snapshots based off of this one are also deleted.  */
> 
> What is the "opaque snapshot object"?

Any instance of virDomainStorageSnapshotPtr.  It is opaque because the
caller does not need to know the contents of struct
_virDomainStorageSnapshot.

> 
>> int virStorageVolSnapshotDelete(virStorageVolSnapshotPtr snapshot,
>> unsigned int flags);
>> [For qcow2, this would involve qemu-img snapshot -d.  For
>> multiple-file snapshots, this would also involve qemu-img commit.]
> 
> Is virStorageVolDelete() modified to also delete the volume's snapshots?

I think this has come up elsewhere in the thread, but yes,
virStorageVolDelete needs to be taught whether to error out if snapshots
exist, or whether to also delete all snapshots associated with a storage
volume being deleted.

-- 
Eric Blake   eblake@xxxxxxxxxx    +1-801-349-2682
Libvirt virtualization library http://libvirt.org

Attachment:
signature.asc

Description: OpenPGP digital signature
--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list