Table of contents

I Preface
  1. Fleece API
  2. Export API
  3. Incremental backups
  4. Other hypervisors
II Links


I Preface

This is an RFC for an external (or pull) backup API in libvirt. There was a
series [1] with a more limited API scope and functionality for this kind of
backup API. Besides other issues, that series was abandoned because the qemu
blockdev-del command had experimental status at the time. There is also a
long-pending RFC series for an internal (or push) backup API [2], which
however does not have much in common with this RFC. Also, there is an RFC
with overall agreement on having a backup API in libvirt [3].

The aim of the external backup API is to provide means for a 3rd-party
application to read/write domain disks as block devices for the purpose of
backup. A disk is read during a backup operation and, in the case of an
active domain, is presented at some point in time (preferably in some
guest-consistent state). A disk is written during a restore operation.

As to providing the disk state at some point in time, one could use existing
disk snapshots for this purpose. However, this RFC introduces an API that
leverages image fleecing (the blockdev-backup command) instead. Image
fleecing is somewhat the inverse of snapshots. In the case of snapshots,
writes go to the top image so the backing image stays constant; in the case
of fleecing, writes go to the same image as before, but the old data is first
popped out to the fleece image, which has the original image as backing. As a
result the fleece image becomes the disk snapshot.

Another task of this API is to provide the disks for read/write operations.
One could try to leverage the libvirt stream API for this purpose, but AFAIK
clients want random access to disk data, which is not what the stream API is
suitable for. I am not sure what the cost of adding a block API to libvirt
would be, particularly the cost of an efficient implementation at the RPC
level, so this RFC instead adds means to export disk data through existing
block interfaces. For qemu that is NBD.

1. Fleece API

The API below provides means to start/stop/query disk image fleecing. I use
the BlockSnapshot name for this operation. Other options are Fleecing,
BlockFleecing, TempBlockSnapshot etc.

/* Start fleecing */
virDomainBlockSnapshotPtr
virDomainBlockSnapshotCreateXML(virDomainPtr domain,
                                const char *xmlDesc,
                                unsigned int flags);

/* Stop fleecing */
int
virDomainBlockSnapshotDelete(virDomainBlockSnapshotPtr snapshot,
                             unsigned int flags);

/* List active fleecings */
int
virDomainBlockSnapshotList(virDomainPtr domain,
                           virDomainBlockSnapshotPtr **snaps,
                           unsigned int flags);

/* Get fleecing description */
char *
virDomainBlockSnapshotGetXMLDesc(virDomainBlockSnapshotPtr snapshot,
                                 unsigned int flags);

/* Get fleecing by name */
virDomainBlockSnapshotPtr
virDomainBlockSnapshotLookupByName(virDomainPtr domain,
                                   const char *name);

Here is a minimal block snapshot xml description to feed to the creating
function:

<domainblocksnapshot>
    <snapshot disk='sda'>
        <fleece file="/path/to/fleece-image-sda"/>
    </snapshot>
    <snapshot disk='sdb'>
        <fleece file="/path/to/fleece-image-sdb"/>
    </snapshot>
</domainblocksnapshot>

Below is an example of what the description function should return upon
successful block snapshot creation. The difference from the above xml is that
the name element (it can be specified on creation as well) and the aliases
are generated. The aliases will be useful later to identify block devices
when exporting through nbd.

<domainblocksnapshot>
    <name>5768a388-c1c4-414c-ac4e-eab216ba7c0c</name>
    <snapshot disk='sda'>
        <fleece file="/path/to/fleece-image-sda"/>
        <alias name="scsi0-0-0-0-backup"/>
    </snapshot>
    <snapshot disk='sdb'>
        <fleece file="/path/to/fleece-image-sdb"/>
        <alias name="scsi0-0-0-1-backup"/>
    </snapshot>
</domainblocksnapshot>
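To make the intended client flow concrete, here is a minimal sketch of how a
backup application might drive the calls above. It is purely illustrative:
these functions are only proposed in this RFC and do not exist in libvirt,
and error handling is omitted.

#include <stdlib.h>
#include <libvirt/libvirt.h>

/* The minimal xml from the example above, fleecing only sda */
static const char *snapshot_xml =
    "<domainblocksnapshot>"
    "  <snapshot disk='sda'>"
    "    <fleece file='/path/to/fleece-image-sda'/>"
    "  </snapshot>"
    "</domainblocksnapshot>";

static void
backup_disk(virDomainPtr dom)
{
    /* Start fleecing: from now on old data is popped out to the fleece
     * image before being overwritten in the original image */
    virDomainBlockSnapshotPtr snap =
        virDomainBlockSnapshotCreateXML(dom, snapshot_xml, 0);

    /* Fetch the generated description to learn the snapshot name and the
     * per-disk aliases used later for the nbd export */
    char *desc = virDomainBlockSnapshotGetXMLDesc(snap, 0);

    /* ... export the fleece image and read the data out (section 2) ... */

    free(desc);

    /* Stop fleecing once the backup client is done with the image */
    virDomainBlockSnapshotDelete(snap, 0);
}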
2. Export API

During a backup operation we need to provide read access to the fleecing
image. This is done through the nbd server of the qemu process; we just need
to specify the disks to export.

/* start block export */
int
virDomainBlockExportStart(virDomainPtr domain,
                          const char *xmlDesc,
                          unsigned int flags);

/* stop block export */
int
virDomainBlockExportStop(virDomainPtr domain,
                         const char *diskName,
                         unsigned int flags);

Here is an example of xml for the starting function:

<blockexport type="nbd" port="8001">
    <listen type="address" address="10.0.2.10"/>
    <disk name="scsi0-0-0-1-backup"/>
</blockexport>

The qemu nbd server is started upon the first disk export start and shut down
upon the last disk export stop. Another option is to control the nbd server
explicitly. One way to do it is to consider the nbd server a new device, so
that the attach/detach/update device functions can be used to start/stop/
update it. Then, in block export start, we need to refer to this device
somehow; this could be a generated name/uuid or a type/address pair. Actually
this approach of exposing the nbd server looks more natural to me, even
though it requires more management on the client side. I am not suggesting it
in the first place mostly because of hesitation about how to refer to the nbd
server on block export.

In any case I'd like to provide the export info in the active domain config:

<devices>
    <blockexport type="nbd" port="8001">
        <listen type="address" address="10.0.2.10"/>
        <disk name="scsi0-0-0-1-backup"/>
        <disk name="scsi0-0-0-2-backup"/>
    </blockexport>
</devices>

This API is used in the restore operation too. The domain is started in a
paused state, the disks to be restored are exported, and the backup client
fills them with the backup data. (A sketch of a full backup cycle using these
calls is given at the end of the next section.)

3. Incremental backups

Qemu can track which parts of a disk have changed since the fleecing start.
This is what is typically called CBT (a dirty bitmap in the qemu community, I
guess). There is also experimental nbd support for it [4] and a bunch of
merged/agreed/proposed bitmap operations that help to organize incremental
backups.

Different hypervisors have different bitmap implementations with different
costs, thus it is up to the hypervisor whether to start CBT by default upon
block snapshot creation. The qemu implementation has memory and disk costs
for every bitmap, so I suggest starting fleecing without a bitmap by default
and adding a flag VIR_DOMAIN_BLOCK_SNAPSHOT_CREATE_CHECKPOINT to ask for a
bitmap to be started.

Disk bitmaps are visible in the active domain definition under the name of
the block snapshot for which the bitmap was started:

<disk type='file' device='disk'>
    ..
    <target dev='sda' bus='scsi'/>
    <alias name='scsi0-0-0-0'/>
    <checkpoint name="93a5c045-6457-2c09-e56c-927cdf34e178"/>
    <checkpoint name="5768a388-c1c4-414c-ac4e-eab216ba7c0c"/>
    ..
</disk>

The bitmap can be specified upon disk export as below (I guess there is no
need to provide more than one bitmap per disk). The active domain config
section for block export is expanded similarly.

<blockexport type="nbd" port="8001">
    <listen type="address" address="10.0.2.10"/>
    <disk name="scsi0-0-0-1-backup"
          checkpoint="5768a388-c1c4-414c-ac4e-eab216ba7c0c"/>
</blockexport>

If a bitmap was created on backup start but the client failed to make the
backup for some reason, then it makes no sense to keep this checkpoint
anymore. As a bitmap takes resources, it is convenient to be able to drop it
in this case. One may also want to drop a bitmap for pure resource management
reasons. So we need an API to remove a bitmap:

int
virDomainBlockCheckpointRemove(virDomainPtr domain,
                               const char *name,
                               unsigned int flags);
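Putting sections 2 and 3 together, here is an illustrative sketch of one
incremental backup cycle using the proposed calls. Nothing here exists in
libvirt yet; the functions, flag and xml follow the proposals above, while
the concrete snapshot name "backup-2", the previous checkpoint to export
against, and the success placeholder are my own assumptions for the example.

#include <libvirt/libvirt.h>

/* Fleece sda and give the block snapshot (and thus the new checkpoint)
 * an explicit name; "backup-2" is a hypothetical name for this example */
static const char *snapshot_xml =
    "<domainblocksnapshot>"
    "  <name>backup-2</name>"
    "  <snapshot disk='sda'>"
    "    <fleece file='/path/to/fleece-image-sda'/>"
    "  </snapshot>"
    "</domainblocksnapshot>";

/* Export the fleece image of sda together with the bitmap accumulated
 * since a previous checkpoint, so the client can read only the blocks
 * changed since that backup; the checkpoint name is taken from the
 * domain definition example above */
static const char *export_xml =
    "<blockexport type='nbd' port='8001'>"
    "  <listen type='address' address='10.0.2.10'/>"
    "  <disk name='scsi0-0-0-0-backup'"
    "        checkpoint='93a5c045-6457-2c09-e56c-927cdf34e178'/>"
    "</blockexport>";

static void
incremental_backup_cycle(virDomainPtr dom)
{
    /* Start fleecing and ask for a new dirty bitmap at the same time */
    virDomainBlockSnapshotPtr snap =
        virDomainBlockSnapshotCreateXML(dom, snapshot_xml,
            VIR_DOMAIN_BLOCK_SNAPSHOT_CREATE_CHECKPOINT);

    /* The first exported disk starts the qemu nbd server */
    virDomainBlockExportStart(dom, export_xml, 0);

    /* ... the backup client reads the changed blocks over nbd; whether
     * that succeeded is tracked here only as a placeholder ... */
    int backup_ok = 1;

    /* Stopping the last disk export shuts the nbd server down */
    virDomainBlockExportStop(dom, "scsi0-0-0-0-backup", 0);

    /* Stop fleecing */
    virDomainBlockSnapshotDelete(snap, 0);

    if (!backup_ok) {
        /* The backup did not complete, so the checkpoint started above is
         * useless; drop it to free qemu resources */
        virDomainBlockCheckpointRemove(dom, "backup-2", 0);
    }
}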
4. Other hypervisors

I took a somewhat thorough look only at the vmware backup interface at [5]
etc. It looks like they do not have fleecing the way qemu has, so for vmware
one can use the usual disk snapshots API. Also, expectedly, there is no nbd
interface for snapshots, thus to deal with vmware snapshot disks one will
eventually have to add a block API to libvirt. So the only point of this RFC
that applies to vmware backups is exporting checkpoints in the disk xml. The
vmware documentation does not say much about bitmap limitations, but I guess
they can still provide only a limited number of them, which can be exposed as
suggested above for active domain disks.

II Links

[1] https://www.redhat.com/archives/libvir-list/2016-September/msg00192.html
[2] https://www.redhat.com/archives/libvir-list/2017-May/msg00379.html
[3] https://www.redhat.com/archives/libvir-list/2016-March/msg00937.html
[4] https://github.com/NetworkBlockDevice/nbd/commit/cfa8ebfc354b2adbdf73b6e6c2520d1b48e43f7a
[5] https://code.vmware.com/doc/preview?id=4076#/doc/vddkBkupVadp.9.3.html#1014717