On 4/24/19 8:50 AM, Peter Krempa wrote: > On Wed, Apr 17, 2019 at 09:09:04 -0500, Eric Blake wrote: >> Introduce a few new public APIs related to incremental backups. This >> builds on the previous notion of a checkpoint (without an existing >> checkpoint, the new API is a full backup, differing from >> virDomainBlockCopy in the point of time chosen and in operation on >> multiple disks at once); and also allows creation of a new checkpoint >> at the same time as starting the backup (after all, an incremental >> backup is only useful if it covers the state since the previous >> backup). Snapshot creation is also a point in time at which creating >> a checkpoint atomically can be useful; as checkpoints are independent >> objects, it is not worth embedding <domaincheckpoint> inside >> <domainsnapshot>, and therefore we need a more powerful version of >> virDomainSnapshotCreateXML(), where we borrow from the naming pattern >> of virDomainMigrate2() and friends. >> >> A backup job also affects filtering a listing of domains, as well as >> adding event reporting for signaling when a push model backup >> completes (where the hypervisor creates the backup); note that the >> pull model does not have an event (starting the backup lets a third >> party access the data, and only the third party knows when it is >> finished). >> >> Since multiple backup jobs can be run in parallel in the future (well, >> qemu doesn't support it yet, but we don't want to preclude the idea), >> virDomainBackupBegin() returns a positive job id, and the id is also >> visible in the backup XML. But until a future libvirt release adds a >> bunch of APIs related to parallel job management where job ids will >> actually matter, the documentation is also clear that job id 0 means >> the 'currently running backup job' (provided one exists), for use in >> virDomainBackupGetXMLDesc() and virDomainBackupEnd(). >> >> The full list of new APIs: >> virDomainBackupBegin; >> virDomainBackupEnd; >> virDomainBackupGetXMLDesc; >> virDomainSnapshotCreateXML2; >> >> Signed-off-by: Eric Blake <eblake@xxxxxxxxxx> >> Reviewed-by: Daniel P. Berrangé <berrange@xxxxxxxxxx> >> --- >> include/libvirt/libvirt-domain-snapshot.h | 8 +- >> include/libvirt/libvirt-domain.h | 41 +++- >> src/driver-hypervisor.h | 21 +++ >> src/qemu/qemu_blockjob.h | 1 + >> examples/object-events/event-test.c | 3 + >> src/conf/domain_conf.c | 2 +- >> src/libvirt-domain-snapshot.c | 89 +++++++++ >> src/libvirt-domain.c | 219 ++++++++++++++++++++++ >> src/libvirt_public.syms | 4 + >> tools/virsh-domain.c | 8 +- >> 10 files changed, 390 insertions(+), 6 deletions(-) > > [...] > >> >> @@ -78,6 +78,12 @@ virDomainSnapshotPtr virDomainSnapshotCreateXML(virDomainPtr domain, >> const char *xmlDesc, >> unsigned int flags); >> >> +/* Take a snapshot of the current VM state, possibly creating a checkpoint */ >> +virDomainSnapshotPtr virDomainSnapshotCreateXML2(virDomainPtr domain, >> + const char *xmlDesc, >> + const char *checkpointXml, >> + unsigned int flags); >> + > > This feels weird in this patch. It does not deal with backup in any way. It would > be better to have this separately or at least together with checkpoint > creating. Particularly true if we decide that the XML for a domain snapshot needs to track which checkpoint was active at the time the snapshot was created, as well as adding a counterpart API for virDomainSnapshotRevert() in order to create a new bitmap at the time of a revert action. And since I have not actually reached the point of demonstrating this API in action in the qemu driver, I agree with pulling it out of this patch, and instead saving it for a separate API patch that deals with any other API additions (such as virDomainSnapshotRevert2(), if needed) when we do figure out how we want to support snapshots and checkpoints on the same domain, even if it is not in the same release as the initial incremental backup support. >> +typedef int >> +(*virDrvDomainBackupEnd)(virDomainPtr domain, int id, unsigned int flags); > > Since the getter APIs are overloaded on top of shouldn't this be > overloaded with qemuDomainAbortJob as well? > Possibly - except virDomainAbortJob() takes no flags and no job id. So the best I can do is teach virDomainAbortJob(dom) to be a synonym for virDomainBackupEnd(dom, 0, VIR_DOMAIN_BACKUP_END_ABORT) to abort an ongoing backup, but only when there is exactly one ongoing backup, and only when there are no other parallel jobs (at least the initial implementation says that backup and migration are mutually exclusive, so that is not too hard to figure out). Down the road, when we finally figure out the best APIs to add to retrofit DECENT parallel job support into all of our existing jobs, then yes, a single job API should be able to do everything that virDomainBackupEnd() can do. >> +++ b/src/qemu/qemu_blockjob.h >> @@ -55,6 +55,7 @@ typedef enum { >> QEMU_BLOCKJOB_TYPE_COPY = VIR_DOMAIN_BLOCK_JOB_TYPE_COPY, >> QEMU_BLOCKJOB_TYPE_COMMIT = VIR_DOMAIN_BLOCK_JOB_TYPE_COMMIT, >> QEMU_BLOCKJOB_TYPE_ACTIVE_COMMIT = VIR_DOMAIN_BLOCK_JOB_TYPE_ACTIVE_COMMIT, >> + QEMU_BLOCKJOB_TYPE_BACKUP = VIR_DOMAIN_BLOCK_JOB_TYPE_BACKUP, > > Here it also tries to behave like a block job. That might be a problem > if a user tries to virDomainBlockJobAbort it as you can't do that > partially. See at the end for more explanation. Life gets interesting when you are doing a backup of multiple disks at once: libvirt is telling qemu to spawn multiple jobs (one job per disk) under the hood, but treating the overall backup operation as a single job. So you are correct that care has to be taken in that if you query the status of a block job for a particular disk, you can learn whether that particular disk is still being backed up or whether it has completed (even of the overall backup job is still ongoing because of other disks), while aborting the blockjob should either fail (can't do one disk but only the entire backup job) or else serve as a way to abort the entire job (not only are you cancelling the one disk, but all other disks in the backup job). I'm not sure which of the two ideas are nicer, but it's probably easier to just state that virDomainBlockJobAbort() on a backup job will fail. > > > [...] > >> diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c >> index 44438c3be8..67ae2a9ecd 100644 >> --- a/src/libvirt-domain.c >> +++ b/src/libvirt-domain.c >> @@ -10351,6 +10351,12 @@ virDomainBlockRebase(virDomainPtr dom, const char *disk, >> * over the destination format, the ability to copy to a destination that >> * is not a local file, and the possibility of additional tuning parameters. >> * >> + * The copy created by this API is not finalized until the job ends, >> + * and does not lend itself to incremental backups (beyond what >> + * VIR_DOMAIN_BLOCK_COPY_SHALLOW provides) nor to third-party control >> + * over the data being copied. For those features, use >> + * virDomainBackupBegin(). > > I'd not mention this at all. This paragraph is useful information, but independent of backups. It could be split into its own patch, if anyone agrees that it is useful text; otherwise, I guess it can be dropped. > >> + * >> * Returns 0 if the operation has started, -1 on failure. >> */ >> int > > In general I'm worried about the design semantics here. I see three > distinct pattern happening: > > 1) The backup tries to behave as a domain job for querying and event This was the approach I was aiming for. > 2) The backup tries to behave as a separate kind of job for ending it virDomainAbortJob() lacks a flags field, so we need an API added for full control; but as I said above, it should not be too hard to at least let virDomainAbortJob() map to a specific version of virDomainBackupEnd() if people are already used to using domain job queries to track the progress of the job (the progress matters more for push-mode backup than for pull-mode). > 3) The backup is also documented as a block job. Or rather, one or more block jobs are used under the hood, and thus you can use block job queries to track the progress of a particular disk within the larger backup job (and until we do support parallel jobs, it also means that a backup job probably locks you out from performing other block jobs). > > Intermixing all of these kinds will create "interresting" semantics. The > domain job can be aborted using virDomainAbortJob API while the backup > can't but it can be queried as such. The backup also is listed as a > block job but virDomainBlockJobAbort can handle only a single > disk-related job, but a backup spans multiple disks. > > Also the stats reporting will break as soon as multiple jobs are > supported as it overloads virDomainGetJobStats. > -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Attachment:
signature.asc
Description: OpenPGP digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list