On Tue, Aug 31, 2021 at 16:46:25 +0300, Nikolay Shirokovskiy wrote: > Hi, all. Hi, sorry for the late reply I was on PTO. > I want to implement reverting to external snapshot functionality. > I've checked the mailing list history and found some previous attempts > (not sure if this is a complete list of them). > > [1] was done in 2012 by the Redhat team itself. [2] and [3] was done in > 2018. > Looks like it was not clear how the API should look like. One additional thing is that me and phrdina started discussing this (in person so I can't point you to a discussion) 2 weeks ago. I'll summarize the points we agreed upon. > For example we have disk.qcow2 <- disk.snap1 chain <- disk.snap2 chain > (after > having snap1 and snap2 snapshots). Now we want to revert to snap1 snapshot. There's one implementation snag we currently have which complicates stuff. Let's expand your above scenario with the snapshot states: s s n n a a p p 1 2 p o │ │ r r base.qcow2 │ snap1.qcow2 │ snap2.qcow2 e i ───────────────────────► │ ────────────────────►│ ────────────────────► s g │ │ e i │ │ n n t A rather big set of problems which we will necessarily encounter when implementing this comes from the following: 1) users can modify the VM betwen snapshots or between the last snapshot and the abandoned state and this modification (e.g. changing a disk image) must be considered to prevent data loss when manipulating the images. 2) libvirt doesn't record the XML used to create the snapshot (e.g. snap1.xm) thus we don't actually know what the (historical) snapshot was actually recording. Coupled with the fact that the user could have changed the VM definition between 'snap1' and 'snap2' we can't even infer what the state was. > The > snapshot state is held by disk.qcow2 image. We can run reverted domain on > disk.qcow2 itself but then we lose snap1 (held by disk.qcow2) and snap2 > (held by disk.snap1). So we definitely should run on some overlay over > disk.qcow2. But then what name should be given to overlay? We should have > an option for mgmt to specify this name like in case of snapshots itself. Exactly. Reversion of external snapshots will necessarily require a new API, which will take a new "snapshot" XML describing the new overlays as you describe below. In the simple case such as with local files we can use the same algorithm for creating overlay filenames as we do when creating snapshots but generally we need to give the MGMT the ability to specify the overlay name. > The [1] has some discussion on adding extra options to reverting > functionality. > Like for example if we revert back to snap2 then having the ability to run > from > state in disk.snap2 rather than disk.snap1. My opinion is there is no need > to > as if one wants to revert to the state of disk2.snap2 it can take a > snapshot (some > snap3). It's possible to avoid doing a combined "take snapshot&revert" operation as long as we have the possibility to take a snapshot and destroy the VM as running it after the point you want to return to is undesirable and pointless. One thing that you need to consider here is that when you are reverting to an arbitrary snapshot, the overlay files after the last snapshot (e.g. snap2.qcow2 (in my diagram), which is the state between snap2 and present) are abandoned and will never be used again. At this point we disucssed that we should be removing those as semantically it's the only point where we can do that as we know the state of the VM which will be abandoned. The problem lies in the fact that between 'snap2' and present the user could have exchanged disks and then we don't have enough metadata to create the overlays. One big additional caveat is that if the user exchanged disk images we _must not_ delete the changed image, so the metadata wrangling might be non-trivial in this case. > At the same time one needs to be aware that revert operation loses > current state and later one can revert only to the states of snapshots. > This is the way internal snapshots work and the way one expects external > snapshots to work too. That is an acceptable caveat for the user. As noted above as long as you can take a snapshot & atomicaly destroy the VM it's acceptable. For libvirt it's harder a bit as described above especially if we don't want to keep litering the disk with unused and invalid images. > The [2] takes an approach of reusing current top image as overlay on revert > so > that in the above example disk.snap2 will be overlay over disk.qcow2 on > reverting to snap1 snapshot. IMHO this is a confusing naming scheme. No in our discussion with Pavel we've ruled out all special cases as too much hassle to implement with little benefit. If a user wants to rever&abandon some state, they can do 2 operations of reverting and then deleting the unwanted snapshots. > > The [3] suggests a different scheme for naming images. For example after > taking > snapshot snap1 the chain should be disk.snap1 <- disk.qcow2 which looks very > appealing to me. With this naming using the [2] approach is quite natural. IMO we can't really change the naming any more, users baked scripts on top of this. Using a flag is also mostly pointless as users will keep creating snapshots without it. Thus I don't think this is a viable approach. The only advantage it would have that it would be simpler to vizualize what happened, but ideally users would not need to care about the steps and filenames at all. > Implementing this does not look hard even for a running domain but this is > a big change to API and all mgmt need to be aware of (well it could be done > safely using a new flag). Yup, too much hassle, questionable outcome. Not worth it. > > Anyway we can go on with current image names. In order to specify overlay > image name let's introduce new API: > > int virDomainRevertToSnapshotXML(virDomainSnapshotPtr snapshot, > char *xmlDesc, > unsigned int flag I was briefly considering that the new snapshot api should also have provisions for specifying a snapshot XML to take the snapshot of the abandoned state, but as discussed it's not really necessary if you can do it in 2 steps. > with XML like: > > <domainsnapshotrevert> > <disks> > <disk name='vda'> > <source file='/path/to/revert/overlay'/> > </disk> > </disks> > </domainsnapshotrevert> > > Having an XML looks like a bit overkill right now but I could not > find a better solution. I don't see any other option. In case you have e.g. networked storage where we can't figure out stuff by ourselves you basically don't have other option. > If overlay name is omitted the generated name will be like disk.snap1-1 in > the > above example to be in alignment with creating a snapshot case. So that > disk.snap1* > images hold states derived from snap1 state. We can also support reverting > to > external snapshot thru existing virDomainRevertToSnapshot for those who > rely on > generated names using this naming scheme. Yes, this would be possible, but quite fragile. But with regular snapshots without any magic it should work. One thing you've missed though is that deletion of snapshots now becomes quite a challenge. s s s n n n a a a p p p 1 2 3 o │ │ │ r base.qcow2 │ snap1.qcow2 │ snap2.qcow2 │ i ─────────────────► │ ────────────►│ ────────────►│ g │ │ │ i │ │ │ n │ │ p │ r │ alternatehistory.qcow2 e │ ────────────────────────────────► s │ e │ n t Specifically the premise is that we don't want to keep unnecessary images (e.g. the trivial solution for deleting 'snap1' would be to remove the metadata and keep base.qcow2 intact) as without the metadata it will become very hard to determine which images are actually still used. So in the above scenario: - User wants to delete snap3: 'snap2.qcow2' overlay becomes unreachable and should be deleted. - User wants to delete snap2: There are 2 options in this case: 1) block-pull snap1 into snap2 2) block-commit snap2 into snap1 The optimal solution is the one that transfers less data, but in the end either of them is equally good in terms of saved storage. - User wants to delete snap1: This is a bit harder: 1) commit into base.qcow2 would invalidate 'alternatehistory.qcow2' 2) pull of base into both snap1.qcow2 and alternatehistory.qcow2 is possible, but may yield big images 3) the last option is to forbid deletion of snapshots that has alternate histories The first thing we should do though is to forbid creation of mixed trees of internal and external snapshots because the matrix of stuff to handle would explode way beyond this already rather complex situation.