On 2/18/19 10:34 AM, Peter Krempa wrote:
>>> Among other things, I still haven't determined how we want to
>>> integrate checkpoints with external snapshots; we could either have:
>>>
>>> virDomainSnapshotCreateXML("<domainsnapshot>...") # existing
>>> virDomainBackupBegin("<domainbackup>...", "<domaincheckpoint>...") # this series
>>> virDomainSnapshotCheckpointCreateXML("<domainsnapshot>...", "<domaincheckpoint>...") # new
>>
>> A slightly related question: when these new APIs (thanks for working on
>> them!) are merged, am I right in assuming that they should be able to
>> "replace" the existing (and provide additional):
>>
>> virDomainBlockRebase(); and
>> virDomainBlockCopy()
>>
>> ... _provided_ that an application is adjusted to using libvirt that is
>> new enough to drive QEMU's '-blockdev', QMP `blockdev-add` et al?
>>
>> Or is that (the new APIs being backward-compatible with blockRebase()
>> and blockCopy()) an explicit non-goal?
>>
>> I'm only asking this out of curiosity.
>
> The checkpoints are really orthogonal to the backing chain/snapshot
> management.

Indeed. Think of virDomainBlockRebase() as syntactic sugar (it really is
a wrapper around either the virDomainBlockPull() streaming operation, or
around virDomainBlockCopy() for a mirror operation). Given your question
grouping it with virDomainBlockCopy(), I'm assuming you are only asking
about the latter.

The biggest difference between virDomainBlockCopy() and
virDomainBackupBegin() is point-in-time: with mirroring, you start the
job up front, but you do not have a valid backup image until you cancel
the job. Since both files have the same content (once the job has hit
the sync phase), cancellation gives you a choice of whether to stay with
the old file (the mirror image is the backup) or to pivot to the new
file (the original file is now the backup).
On the other hand, with virDomainBackupBegin(), you select the
point-in-time at the point where you start the job, and the backup copy
is created independently of the running image, so there is no pivoting
possible.

virDomainBackupBegin() will work even without checkpoints (in which
case, it is a full image backup); but its main power is that, when used
WITH checkpoints, the backup operation can be done with much less effort
than a full copy. Ideally, we want external snapshots to also be points
where checkpoints can be created; John Snow and I have had some ideas
about what is needed, but our focus is first getting the API working
without worrying about external snapshots, while ensuring that the XML
has enough flexibility to add in those improvements later without
needing more API.

Another difference: virDomainBlockCopy() operates on only one block
device at a time (you have to issue multiple calls if your guest has
multiple disks, although the calls can at least be run in parallel). But
virDomainBackupBegin() operates on the entire domain at once, with your
choice of granularity on which disk(s) to have involved (similar to how
virDomainSnapshotCreateXML() lets you choose which disks to snapshot).

>
> Checkpoints don't really store any data but rather provide a way to
> determine which blocks were changed and thus need backup. Also one point
> is that checkpoints don't allow (or I did not notice it in the proposal)
> to capture memory state along with the disk state.

Correct - checkpoints ONLY track which portions of the disk have
changed. In fact, when taking a differential backup, you can really only
request all changes occurring between one point in the past and the
present.
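As a rough illustration of that restriction, here is a toy Python model
(not libvirt or QEMU code). It assumes, purely for illustration, that
each checkpoint owns one bitmap recording the sectors written until the
next checkpoint is created, and that a differential backup merges every
bitmap from the chosen checkpoint onward. The ToyDisk class and its
method names are hypothetical.

```python
class ToyDisk:
    """Toy disk with per-checkpoint dirty bitmaps (illustrative only)."""

    def __init__(self, nsectors):
        self.data = [0] * nsectors
        # Ordered list of (checkpoint name, set of sectors written
        # while that checkpoint was the newest one).
        self.checkpoints = []

    def create_checkpoint(self, name):
        # From now on, writes are recorded against this checkpoint.
        self.checkpoints.append((name, set()))

    def write(self, sector, value):
        self.data[sector] = value
        if self.checkpoints:
            # Only the sector number is recorded, never the old data.
            self.checkpoints[-1][1].add(sector)

    def backup_since(self, name):
        """Return {sector: CURRENT value} for sectors changed since `name`."""
        names = [n for n, _ in self.checkpoints]
        start = names.index(name)
        dirty = set()
        for _, bitmap in self.checkpoints[start:]:
            dirty |= bitmap
        return {s: self.data[s] for s in sorted(dirty)}


disk = ToyDisk(8)
disk.create_checkpoint("Check1")
disk.write(1, "A")   # written between Check1 and Check2
disk.write(2, "A")   # written between Check1 and Check2
disk.create_checkpoint("Check2")
disk.write(2, "B")   # sector 2 overwritten after Check2
disk.write(5, "C")

since_check1 = disk.backup_since("Check1")
since_check2 = disk.backup_since("Check2")
print(since_check1)
print(since_check2)
```

In this sketch sector 2 held "A" when Check2 was created, but it was
overwritten afterwards. The bitmaps still flag the sector as changed,
yet only its current contents are available to copy.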
In particular, if you have Check1 and Check2, you can request the backup
of all sectors touched between Check1 and the present, or all sectors
touched between Check2 and the present, but you cannot request the
backup of only the sectors touched between Check1 and Check2 (at least
not through the libvirt API), because those sectors may have changed
again between Check2 and the present, and the bitmaps only record which
sectors have changed, and NOT what the data was at the time of Check2's
creation.

> In the ideal world of snapshots when deletion/reversion was implemented
> we'd never expose the virDomainBlockCommit and virDomainBlockPull APIs
> including the multi-use backdoor virDomainBlockRebase() which should
> have never existed and users would do equivalent operations using the
> snapshot APIs.
>
> virDomainBlockCopy is useful on its own though but badly combines with
> snapshots. This will need some fixing.

Indeed, and that's true regardless of whether the backup API goes in
(although the backup API probably compounds the issue on the number of
corner cases we have to think about; the conservative approach is that
at least in the beginning, you won't be able to run a BlockCopy and a
BackupBegin job at the same time).

>
> With the new checkpoint APIs the situation is even more "fun" as
> modification of the backing chain involves in some cases changes to the
> bitmaps. Ideally these would do "the right thing (TM)" during snapshot
> deletion/reversion. Given that at this time we don't support snapshot
> deletion/reversion for external snapshots we can use the excuse that
> snapshot management is not implemented so that checkpoints don't need
> to be modified.

Somewhat correct - but we DO have to think about how we plan for the API
to grow in the future when we eventually DO fix snapshot
deletion/reversion.
Hence my question - would we rather have the creation of a checkpoint at
the same time as the creation of an external snapshot (which we DO know
we will want) to occur via the existing API (by extending the snapshot
XML to include the checkpoint XML as a sub-element), or via a new API
(by passing the checkpoint XML as a second parameter)? Once we've
answered that question, it then determines what signature we want for
virDomainBackupBegin() (either two separate XML parameters, one for the
backup job and one for the checkpoint creation, as presented in v4 of
the series, OR as one single XML call where the checkpoint XML is a
sub-element of the backup XML).

>
> Since virDomainBlockCommit/virDomainBlockPull are basically a backdoor to do
> snapshot merging which is not actually recorded in the snapshot XML the
> same damage can happen to the checkpoints.
>
> I'm not sure how we want to deal with that at this point though.
>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org