On 03/14/2012 02:16 AM, Paolo Bonzini wrote: >> <domainsnapshot> >> <disks> >> <disk name='/src/base.img' snapshot='external'> >> <source file='/src/snap.img'/> >> <mirror file='/dest/snap.img'/> >> </disk> >> </disks> >> </domainsnapshot> >> would create a new libvirt snapshot object with /src/snap.img as the >> read-write new image, and /dest/snap.img as the new write-only mirror. >> On success, this rewrites the domain's live XML to point to >> /src/snap.img as its current file. > > This is an awfully low-level API; you're designing for oVirt rather than > for everything else. The problem here is twofold: > > 1) you're defining a snapshot that cannot be started without losing the > mirrors. Yes, I don't see any way around that - this proposal will only let you create a mirror with a live qemu session; the moment you start a new qemu session, you have lost the mirroring. I agree that when mirroring is more mature (so that the qemu command line can start a domain with mirroring from the get-go), libvirt will probably need nicer API to expose that. For that matter, someday libvirt needs to expose the entire backing chain in the <domain> XML, including any mirroring, rather than just leaving mirroring to this one-off solution for oVirt in <domainsnapshot>, but that's a bigger project. > > 2) in case the snapshotting is aborted early for any reason, oVirt has > to do a rebase operation manually. This is currently O(size-of-disk), > not O(changes-in-the-last-image), so it wastes both disk space and time. I don't follow the argument for this. It may be a valid complaint, but I'm not yet seeing why you think oVirt has to do a rebase operation manually, or why that operation will cost O(disk) rather than O(changes). If I have: base <- snap1 and request a snapshot that mirrors to snap2 in two locations, but abort half-way through, then I can just call virDomainSnapshotDelete(VIR_DOMAIN_SNAPSHOT_DELETE_METADATA) which makes libvirt forget that it attempted to take a snapshot, but without losing the XML that says that the disk is now based on snap2. That means restarting the domain would use: base <- snap1 <- snap2 as its backing file, and virDomainBlockRebase can be used to initiate a 'block_stream' to collapse it back to a shorter backing chain. > > If it works, I cannot really say "don't do it", but I think the oVirt > mirrored snapshots idea is a dead-end and a workaround for lack of block > device streaming (which is now supported). You could have a simpler, > high-level API based on streaming rather than snapshotting. So, if you > have /src/disk.img as your image, you would have a new API: > > virDomainBlockCopy(dom, "disk", > "/dst/disk.img", "/src/base.img", > bandwidth, flags) Yes, a new API would ultimately be nicer, and allow us to expose more features of the new qemu 'transaction' command. I will probably eventually add something along those lines, but it goes against my stated goal of implementing a first-cut working solution for oVirt that uses just the 0.9.10 API. In other words, for backporting purposes, I'd like a solution that doesn't require a .so bump, even if it is ugly; while saving a new API until we have a bit more experience in all the various cases that users want to make sure the new API can more conveniently cover all of those cases. > >> Finally, virDomainSnapshotDelete will learn a new flag, >> VIR_DOMAIN_SNAPSHOT_DELETE_REOPEN_MIRROR, which says that the libvirt >> snapshot object will be deleted, but only after first calling the qemu >> 'drive-reopen' monitor command for all disks that had a <mirror> in the >> associated snapshot object. That is, for the above example, this would >> reopen the disk from it's current read-write of /src/snap.img over to >> the second storage domain's /dest/snap.img with it's accompanying >> mirrored backing chain. On success, this rewrites the domain's live XML >> to point to the just-opened mirror location. This flag will fail if the >> libvirt snapshot being deleted is not the current image, or if the >> snapshot being deleted does not have any mirrored disks. > > I think you also need VIR_DOMAIN_SNAPSHOT_DELETE_REMOVE_MIRROR, to be > used in case of abort so that the domain can actually be started. Or it > could be an event MIRROR_DROPPED or something like that. Good call. VIR_DOMAIN_SNAPSHOT_DELETE_METADATA_ONLY says to drop libvirt's notion of the snapshot object, but it won't stop qemu from mirroring; so an additional flag that tells libvirt to 'drive-reopen' back to the source to discard any mirroring would be handy. I'm not sure whether an event for MIRROR_DROPPED is needed; I guess the only time a mirror is dropped without an explicit action that causes a 'drive-reopen' is where you try to restart a qemu process. But since oVirt is using transient domains, that means that destroying a running qemu process and then starting a new transient domain on the same image loses all snapshot information anyway, so oVirt should already be aware that it has lost the mirroring information as part of tearing down and rebuilding a transient domain. I'll keep the idea of an event in mind, but I'm not sure I see any place where it would be useful. But it does point out that I should probably either prevent the use of a <domainsnapshot> with <mirror> on persistent domains, or at least prevent the use of 'virsh start' on such a persistent domain until the snapshot has been deleted. -- Eric Blake eblake@xxxxxxxxxx +1-919-301-3266 Libvirt virtualization library http://libvirt.org
Attachment:
signature.asc
Description: OpenPGP digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list