ping

On 14.11.2017 18:38, Nikolay Shirokovskiy wrote:

Table of contents.

I Preface

1. Fleece API
2. Export API
3. Incremental backups
4. Other hypervisors

II Links


I Preface

This is an RFC for an external (or pull) backup API in libvirt. There was a
series [1] with a more limited API scope and functionality for this kind of
backup API. Besides other issues, that series was abandoned because the qemu
blockdev-del command still had experimental status at the time. There is also
a long-pending RFC series for an internal (or push) backup API [2], which
however does not have much in common with this RFC. Finally, there is an RFC
with overall agreement on having a backup API in libvirt [3].

The aim of the external backup API is to provide means for a third-party
application to read/write domain disks as block devices for the purpose of
backup. A disk is read during a backup operation and, in the case of an
active domain, is presented at some point in time (preferably in some
guest-consistent state). A disk is written during a restore operation.

As to providing disk state at some point in time, one could use existing disk
snapshots for this purpose. However, this RFC introduces an API to leverage
image fleecing (the blockdev-backup command) instead. Image fleecing is
somewhat the inverse of snapshots: with snapshots, writes go to the top image
so the backing image stays constant; with fleecing, writes go to the same
image as before, but the old data is first popped out to a fleece image which
has the original image as backing. As a result, the fleece image becomes a
snapshot of the disk.

Another task of this API is to provide the disks for read/write operations.
One could try to leverage the libvirt stream API for this purpose, but AFAIK
clients want random access to disk data, which is not what the stream API is
suitable for. I'm not sure what the costs of adding a block API to libvirt
would be, particularly the cost of making the implementation efficient at the
RPC level, so this RFC instead adds means to export disk data through
existing block interfaces. For qemu this is NBD.


1. Fleece API

So the API below provides means to start/stop/query disk image fleecing. I
use the name BlockSnapshot for this operation. Other options are Fleecing,
BlockFleecing, TempBlockSnapshot etc.

/* Start fleecing */
virDomainBlockSnapshotPtr
virDomainBlockSnapshotCreateXML(virDomainPtr domain,
                                const char *xmlDesc,
                                unsigned int flags);

/* Stop fleecing */
int
virDomainBlockSnapshotDelete(virDomainBlockSnapshotPtr snapshot,
                             unsigned int flags);

/* List active fleecings */
int
virDomainBlockSnapshotList(virDomainPtr domain,
                           virDomainBlockSnapshotPtr **snaps,
                           unsigned int flags);

/* Get fleecing description */
char *
virDomainBlockSnapshotGetXMLDesc(virDomainBlockSnapshotPtr snapshot,
                                 unsigned int flags);

/* Get fleecing by name */
virDomainBlockSnapshotPtr
virDomainBlockSnapshotLookupByName(virDomainPtr domain,
                                   const char *name);

Here is a minimal block snapshot xml description to feed to the creating
function:

<domainblocksnapshot>
    <snapshot disk='sda'>
        <fleece file="/path/to/fleece-image-sda"/>
    </snapshot>
    <snapshot disk='sdb'>
        <fleece file="/path/to/fleece-image-sdb"/>
    </snapshot>
</domainblocksnapshot>

Below is an example of what the get-description function should provide upon
successful block snapshot creation. The difference from the above xml is that
the name element (it can be specified on creation as well) and the aliases
are generated. The aliases will be useful later to identify block devices
when exporting through NBD.

<domainblocksnapshot>
    <name>5768a388-c1c4-414c-ac4e-eab216ba7c0c</name>
    <snapshot disk='sda'>
        <fleece file="/path/to/fleece-image-sda"/>
        <alias name="scsi0-0-0-0-backup"/>
    </snapshot>
    <snapshot disk='sdb'>
        <fleece file="/path/to/fleece-image-sdb"/>
        <alias name="scsi0-0-0-1-backup"/>
    </snapshot>
</domainblocksnapshot>
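To make the intended usage concrete, here is a sketch of how a client could
drive the calls above. This only illustrates the proposal: none of the
virDomainBlockSnapshot* functions exist in libvirt yet, the domain name and
fleece image path are made up, and error handling is omitted.

#include <stdio.h>
#include <stdlib.h>
#include <libvirt/libvirt.h>

int main(void)
{
    virConnectPtr conn;
    virDomainPtr dom;
    virDomainBlockSnapshotPtr snap;
    char *desc;

    /* Fleece sda into a scratch image (xml as in the minimal example) */
    const char *xml =
        "<domainblocksnapshot>"
        "  <snapshot disk='sda'>"
        "    <fleece file=\"/var/lib/libvirt/images/fleece-sda.qcow2\"/>"
        "  </snapshot>"
        "</domainblocksnapshot>";

    conn = virConnectOpen("qemu:///system");
    dom = virDomainLookupByName(conn, "mydomain");

    /* Start fleecing; from this point the fleece image presents the
       disk state frozen at snapshot creation time */
    snap = virDomainBlockSnapshotCreateXML(dom, xml, 0);

    /* Fetch the full description to learn the generated name/aliases */
    desc = virDomainBlockSnapshotGetXMLDesc(snap, 0);
    printf("%s\n", desc);
    free(desc);

    /* ... export and read the fleece image here, see section 2 ... */

    /* Stop fleecing; the scratch image can be deleted afterwards */
    virDomainBlockSnapshotDelete(snap, 0);

    virDomainFree(dom);
    virConnectClose(conn);
    return 0;
}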

2. Export API

During a backup operation we need to provide read access to the fleecing
image. This is done through the qemu process's NBD server; we just need to
specify the disks to export.

/* start block export */
int
virDomainBlockExportStart(virDomainPtr domain,
                          const char *xmlDesc,
                          unsigned int flags);

/* stop block export */
int
virDomainBlockExportStop(virDomainPtr domain,
                         const char *diskName,
                         unsigned int flags);

Here is an example of xml for the starting function:

<blockexport type="nbd" port="8001">
    <listen type="address" address="10.0.2.10"/>
    <disk name="scsi0-0-0-1-backup"/>
</blockexport>

The qemu NBD server is started upon the first disk export start and shut down
upon the last disk export stop. Another option is to control the NBD server
explicitly. One way to do it is to consider the NBD server a new device, so
that we can use the attach/detach/update device functions to start/stop/update
the NBD server. Then in block export start we need to refer to this device
somehow; this could be a generated name/uuid or a type/address pair. Actually
this approach of exposing the NBD server looks more natural to me, even
though it requires more management on the client side. I am not suggesting it
in the first place mostly due to hesitation over how to refer to the NBD
server on block export.

In any case I'd like to provide export info in the active domain config:

<devices>
    <blockexport type="nbd" port="8001">
        <listen type="address" address="10.0.2.10"/>
        <disk name="scsi0-0-0-1-backup"/>
        <disk name="scsi0-0-0-2-backup"/>
    </blockexport>
</devices>

This API is used in the restore operation too. The domain is started in a
paused state, the disks to be restored are exported, and the backup client
fills them with the backup data.
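A sketch of the client side, under the same caveats as before (the proposed
functions do not exist yet; the disk alias is the one from the generated xml
in section 1):

#include <libvirt/libvirt.h>

/* Expose the fleeced sda over NBD; error handling omitted */
static int
export_backup_disk(virDomainPtr dom)
{
    const char *xml =
        "<blockexport type=\"nbd\" port=\"8001\">"
        "  <listen type=\"address\" address=\"10.0.2.10\"/>"
        "  <disk name=\"scsi0-0-0-0-backup\"/>"
        "</blockexport>";

    return virDomainBlockExportStart(dom, xml, 0);
}

Assuming the disk alias doubles as the NBD export name, the backup client can
then read the data with any NBD-capable tool, for example:

qemu-img convert -O qcow2 nbd://10.0.2.10:8001/scsi0-0-0-0-backup sda-backup.qcow2

and finish with virDomainBlockExportStop(dom, "scsi0-0-0-0-backup", 0).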

3. Incremental backups

Qemu can track which parts of a disk have changed since the fleecing start.
This is what is typically called CBT (a dirty bitmap, in qemu community
terms, I guess). There is also experimental NBD support [4] and a bunch of
merged/agreed/proposed bitmap operations that help to organize incremental
backups.

Different hypervisors have different bitmap implementations with different
costs, thus it is up to the hypervisor whether or not to start CBT by default
upon block snapshot creation. The qemu implementation has memory and disk
costs for every bitmap, thus I suggest starting fleecing without a bitmap by
default and adding a VIR_DOMAIN_BLOCK_SNAPSHOT_CREATE_CHECKPOINT flag to ask
for a bitmap to be started.

Disk bitmaps are visible in the active domain definition under the name of
the block snapshot for which the bitmap was started:

<disk type='file' device='disk'>
    ..
    <target dev='sda' bus='scsi'/>
    <alias name='scsi0-0-0-0'/>
    <checkpoint name="93a5c045-6457-2c09-e56c-927cdf34e178"/>
    <checkpoint name="5768a388-c1c4-414c-ac4e-eab216ba7c0c"/>
    ..
</disk>

The bitmap can be specified upon disk export as below (I guess there is no
need to provide more than one bitmap per disk). The active domain config
section for the block export is expanded similarly.

<blockexport type="nbd" port="8001">
    <listen type="address" address="10.0.2.10"/>
    <disk name="scsi0-0-0-1-backup" checkpoint="5768a388-c1c4-414c-ac4e-eab216ba7c0c"/>
</blockexport>

If a bitmap was created on backup start but the client failed to make a
backup for some reason, then it makes no sense to keep this checkpoint
anymore. As having a bitmap takes resources, it is convenient to drop the
bitmap in this case. One may also want to drop a bitmap for pure resource
management reasons. So we need an API to remove a bitmap:

int
virDomainBlockCheckpointRemove(virDomainPtr domain,
                               const char *name,
                               unsigned int flags);


4. Other hypervisors

I took a somewhat considerable look only at the vmware backup interface at
[5] etc. It looks like they don't have fleecing like qemu does, so for vmware
one can use the usual disk snapshot API to get snapshotted state. As
expected, there is also no NBD interface for the snapshots, thus to deal with
vmware snapshot disks one will eventually have to add a block API to libvirt.
So the only part of this RFC that applies to vmware backups is exposing
checkpoints in the disk xml. The vmware documentation does not say much about
bitmap limitations, but I guess they can still provide only a limited number
of them, which can be exposed as suggested for active domain disks.


II Links:

[1] https://www.redhat.com/archives/libvir-list/2016-September/msg00192.html
[2] https://www.redhat.com/archives/libvir-list/2017-May/msg00379.html
[3] https://www.redhat.com/archives/libvir-list/2016-March/msg00937.html
[4] https://github.com/NetworkBlockDevice/nbd/commit/cfa8ebfc354b2adbdf73b6e6c2520d1b48e43f7a
[5] https://code.vmware.com/doc/preview?id=4076#/doc/vddkBkupVadp.9.3.html#1014717
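P.S. As a closing illustration of how the proposed pieces compose, here is a
sketch of one incremental backup cycle. Again, every function, flag and name
below is only this RFC's proposal, not existing libvirt API; the checkpoint
uuid is reused from the examples above and the paths are made up.

#include <libvirt/libvirt.h>

/* One incremental backup cycle for disk sda; error handling omitted */
static int
incremental_backup(virDomainPtr dom)
{
    virDomainBlockSnapshotPtr snap;

    /* Fleece sda and ask for a new dirty bitmap (checkpoint) that will
       accumulate changes until the next backup */
    const char *snap_xml =
        "<domainblocksnapshot>"
        "  <snapshot disk='sda'>"
        "    <fleece file=\"/var/lib/libvirt/images/fleece-sda.qcow2\"/>"
        "  </snapshot>"
        "</domainblocksnapshot>";

    snap = virDomainBlockSnapshotCreateXML(dom, snap_xml,
               VIR_DOMAIN_BLOCK_SNAPSHOT_CREATE_CHECKPOINT);

    /* Export the fleece image together with the bitmap started by the
       previous backup, so the client fetches only the changed blocks */
    const char *export_xml =
        "<blockexport type=\"nbd\" port=\"8001\">"
        "  <listen type=\"address\" address=\"10.0.2.10\"/>"
        "  <disk name=\"scsi0-0-0-0-backup\""
        "        checkpoint=\"5768a388-c1c4-414c-ac4e-eab216ba7c0c\"/>"
        "</blockexport>";

    virDomainBlockExportStart(dom, export_xml, 0);

    /* ... the backup client copies the dirty extents over NBD ... */

    virDomainBlockExportStop(dom, "scsi0-0-0-0-backup", 0);
    virDomainBlockSnapshotDelete(snap, 0);

    /* The previous checkpoint has been consumed by this backup, so its
       bitmap can be dropped to free resources */
    return virDomainBlockCheckpointRemove(dom,
               "5768a388-c1c4-414c-ac4e-eab216ba7c0c", 0);
}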