On 11/01/2012 10:22 AM, Peter Krempa wrote: > This patch adds limited support for deleting external snaphots. The s/snaphots/snapshots/ > machine must not be active and only whole subtrees of snapshots can be > deleted as reparenting was not yet implemented for external snapshots. These are reasonable restrictions for the first implementation. We may relax some of them later - for example, if qemu adds support for doing blockcommit from the active layer, then that would let us do a live deletion of an external snapshot. > --- > This patch introduces a possible segfault, but I haven't had time to chase it. Can you describe the steps you reproduced it with? I'm not sure I saw it either. > +++ b/src/qemu/qemu_domain.c > @@ -1724,13 +1724,102 @@ qemuDomainSnapshotForEachQcow2(struct qemud_driver *driver, > op, try_all, def->ndisks); > } > > -/* Discard one snapshot (or its metadata), without reparenting any children. */ > -int > -qemuDomainSnapshotDiscard(struct qemud_driver *driver, > - virDomainObjPtr vm, > - virDomainSnapshotObjPtr snap, > - bool update_current, > - bool metadata_only) > +/* Discard one external snapshot (or its metadata), without reparenting any children */ > +static int > +qemuDomainSnapshotDiscardExternal(struct qemud_driver *driver, > + virDomainObjPtr vm, > + virDomainSnapshotObjPtr snap, > + bool update_current, > + bool metadata_only) > +{ > + char *snapFile = NULL; > + int i; > + int ret = -1; > + virDomainSnapshotObjPtr parentsnap = NULL; > + > + if (!metadata_only) { > + if (virDomainObjIsActive(vm)) { > + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, > + _("Can't delete snapshot '%s'. Domain '%s' is active."), > + snap->def->name, vm->def->name); > + goto cleanup; > + } > + > + if (snap->def->file) { > + if (unlink(snap->def->file) < 0) { > + virReportSystemError(errno, > + _("Failed to remove memory checkpoint " > + "save file '%s'"), snap->def->file); > + goto cleanup; > + } > + VIR_FREE(snap->def->file); > + } This part makes sense. > + > + for (i = 0; i < snap->def->ndisks; i++) { > + virDomainSnapshotDiskDefPtr disk = &(snap->def->disks[i]); > + if (disk->file && unlink(disk->file) < 0) { > + virReportSystemError(errno, > + _("Failed to remove disk snapshot file '%s'"), > + disk->file); > + goto cleanup; > + } > + VIR_FREE(disk->file); > + } But this part is rather drastic. More on this below. > + } > + From here... > + if (snap == vm->current_snapshot) { > + if (update_current && snap->def->parent) { > + parentsnap = virDomainSnapshotFindByName(vm->snapshots, > + snap->def->parent); > + if (!parentsnap) { > + VIR_WARN("missing parent snapshot matching name '%s'", > + snap->def->parent); > + } else { > + parentsnap->def->current = true; > + > + if (qemuDomainSnapshotWriteMetadata(vm, parentsnap, > + driver->snapshotDir) < 0) { > + VIR_WARN("failed to set parent snapshot '%s' as current", > + snap->def->parent); > + parentsnap->def->current = false; > + parentsnap = NULL; > + } > + } > + } > + vm->current_snapshot = parentsnap; > + } > + > + if (virAsprintf(&snapFile, "%s/%s/%s.xml", driver->snapshotDir, > + vm->def->name, snap->def->name) < 0) { > + virReportOOMError(); > + goto cleanup; > + } > + > + if (unlink(snapFile) < 0) > + VIR_WARN("Failed to unlink %s", snapFile); > + virDomainSnapshotObjListRemove(vm->snapshots, snap); ...to here, the code is almost identical to the internal snapshot case, except that you reordered things so that they are unsafe (that is, the internal case ensures that we malloc in order to detect any OOM _prior_ to modifying any parent snapshot). I think the tail of these two functions should instead be common functionality... > + > + ret = 0; > + > +cleanup: > + if (ret < 0 && !snapFile && > + qemuDomainSnapshotWriteMetadata(vm, snap, driver->snapshotDir) < 0) > + VIR_WARN("Failed to store modified snapshot config"); > + > + VIR_FREE(snapFile); > + > + return ret; > +} > + > +/* Discard one internal snapshot (or its metadata), > + * without reparenting any children. > + */ > +static int > +qemuDomainSnapshotDiscardInternal(struct qemud_driver *driver, > + virDomainObjPtr vm, > + virDomainSnapshotObjPtr snap, > + bool update_current, > + bool metadata_only) > { > char *snapFile = NULL; > int ret = -1; > @@ -1791,6 +1880,25 @@ cleanup: > return ret; > } > > +/* Discard one snapshot (or its metadata), without reparenting any children. */ > +int > +qemuDomainSnapshotDiscard(struct qemud_driver *driver, > + virDomainObjPtr vm, > + virDomainSnapshotObjPtr snap, > + bool update_current, > + bool metadata_only) > +{ > + int ret; > + > + if (virDomainSnapshotIsExternal(snap)) > + ret = qemuDomainSnapshotDiscardExternal(driver, vm, snap, update_current, metadata_only); > + else > + ret = qemuDomainSnapshotDiscardInternal(driver, vm, snap, update_current, metadata_only); ...here. > + > + return ret; > +} > + > + > /* Hash iterator callback to discard multiple snapshots. */ > void qemuDomainSnapshotDiscardAll(void *payload, > const void *name ATTRIBUTE_UNUSED, > diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c > index 44bf6dc..46b7e3a 100644 > --- a/src/qemu/qemu_driver.c > +++ b/src/qemu/qemu_driver.c > @@ -12376,6 +12376,13 @@ qemuDomainSnapshotReparentChildren(void *payload, > return; > } > > + if (virDomainSnapshotIsExternal(snap)) { > + virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s", > + _("Reparenting of external snapshots is not implemented")); > + rep->err = -1; > + return; > + } Why not? If I have the snapshot chain: internal_1 <- internal_2 <- external and delete just internal_2, why can't I reparent external to point to internal_1? I'm wondering if the memory corruption is happening here by returning failure in part of the tree, while the rest of the tree assumes success, so that the end result is a corrupted tree that points into free'd storage. > + > VIR_FREE(snap->def->parent); > snap->parent = rep->parent; > > @@ -12433,10 +12440,20 @@ static int qemuDomainSnapshotDelete(virDomainSnapshotPtr snapshot, > virDomainSnapshotForEachDescendant(snap, > qemuDomainSnapshotCountExternal, > &external); > - if (external) { > + if (external && > + virDomainObjIsActive(vm)) { > virReportError(VIR_ERR_CONFIG_UNSUPPORTED, > - _("deletion of %d external disk snapshots not " > - "supported yet"), external); > + _("deletion of %d external disk snapshots is " > + "supported only on inactive guests"), external); > + goto cleanup; > + } Like I said earlier, this may be a reasonable first round, since we can't do live blockcommit yet. However, I think we're missing a bigger picture here. Remember, with internal snapshots, the way things work is that the same file holds both the snapshot (a point in time) and the delta (all changes since that time). Right now, the action of deleting a snapshot says to forget about the point in time, but preserve all changes since that point in time, if those changes belong to either the current state or to another branch of the snapshot hierarchy. The only time data is actually deleted is if it is not in current use and there is no other snapshot remaining that can get back to the data. But above, you just blindly unlink()d the qcow2 wrapper file, which in effect discarded the delta (all changes since the snapshot), and _reverted_ us back to the point in time where the snapshot was taken. Probably not what the user meant; or if it is, this is a NEW mode of operation, so it should only be allowed if the user passes in a new flag, maybe named VIR_DOMAIN_SNAPSHOT_DELETE_REVERT, which says to delete the external snapshot files _and_ revert to the state at the time it was taken. In fact, this particular mode of operation might fit better with virDomainRevertToSnapshot instead of virDomainSnapshotDelete (again, controlled by a flag), but let's first come to an agreement on the scenarios that users might want... 1. I created a disk snapshot, and it created new qcow2 files. Now I no longer need to go to that point in time, but I want to keep the qcow2 files, without losing my changes since the snapshot. Solution: 'virsh snapshot-delete --metadata' merely forget that we took the snapshot, but keep the backing chain as-is. 2. I created a disk snapshot, but now I no longer need to go back, and I want my disk chain as short as it was before I created the snapshot, and without losing my changes since the snapshot. Solution: use 'qemu-img commit' to squash the delta back into the original file, then delete the backing file as no longer necessary. If any further snapshots depended on the backing file, then they also need a touchup (reparenting) to deal with the change in the backing chain, or else they must also be deleted. 'qemu-img commit' can only be run while the disk is offline; worse, it is time-consuming, so we now have to deal with the fact that deleting a snapshot probably ought to be an async job with status updates, rather than a blocking job. But if we implement this solution, we can also use it for 'virsh blockcommit' for offline commits. This mode most closely matches what we do when deleting an internal snapshot. 3. I created an external checkpoint, and now I no longer need to go that point in time. Solution: Delete the memory state file, then all we are left with is a disk snapshot, so we can now use scenario 1 or 2 to clean up the rest. 4. I created a disk snapshot, then did some experiments, and decided I don't need the result and want to go back to that point in time without remembering the experiments. Solution: combination revert and delete, where we reset the domain to use the backing file and delete the qcow2 wrapper files. 5. I created an external checkpoint of a running machine, and now want to go back to that point in time. Solution: like 4, this is a combination revert and delete, where we delete qcow2 wrapper files, but we also boot using the saved memory image and backing files. 6. I created an external checkpoint, did some experiments, and want to remember those results; but I also want to go back to the checkpoint and do some more experiments on a different branch. Solution: here, we need the ability to create NEW qcow2 files that also wrap the common base image. Since virDomainRevertToSnapshot doesn't have a way for us to pass in the new file names we wish to use, this needs to go through virDomainSnapshotCreateXML, which needs a new flag that says that we are simultaneously reverting to the state of an existing snapshot and creating a new snapshot from that point rather than at the current state of execution. 7. Once we allow creation of snapshot forks from a common base point, we need to worry about deleting one fork but keeping other forks alive - for an external checkpoint, the memory file is now effectively shared by multiple snapshots, and must not be deleted until the last snapshot referencing the file is deleted. 8. It is also possible to use multiple snapshots to create a longer chain, such as 'base <- snap1 <- snap2 <- current', and then delete snap1 while still remembering snap2. Right now, we can use blockcommit while qemu is live (and a sequence of qemu-img commit and qemu-img rebase calls for offline) to merge snap1 into base; we could also use blockpull (via qemu-img rebase while offline, but currently lacking support if qemu is running) to pull snap1 into snap2. That is, forbidding the user from deleting an external snapshot unless they also delete descendants is not necessarily desirable. I guess I've turned this more into a brain dump of where we should be headed - there is really a lot of functionality involved in reverting to external files, and we should be exposing choices to the user whether to pull into current, push into backing, discard files, or create parallel branches from the same starting point; and solutions to some of these issues touch not only snapshot deletion, but also snapshot revert and snapshot create. For this particular patch, though, I think we definitely want to introduce a new flag that says when we are doing a combined revert and delete, so that we can leave the option of deletion without losing current state (similar to how internal snapshot deletion works) until a later patch where we use qemu-img commit to do the work. -- Eric Blake eblake@xxxxxxxxxx +1-919-301-3266 Libvirt virtualization library http://libvirt.org
Attachment:
signature.asc
Description: OpenPGP digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list