Consider the case of a guest that has multiple virtual disks, some residing on shared storage (such as the OS proper) and some on local storage (scratch space, where the guest gets faster response because the virtual disk does not have to go over the network, and where the guest can possibly keep working even if the disk is hot-unplugged). During migration, you'd want different handling of the two disks: the destination can already see the shared disk, but for the local disk it must either copy the contents or recreate a blank scratch volume.

Or consider the case where a guest has one disk as qcow2 (it is not modified frequently, and benefits from sharing a common backing file with other guests), while another disk is raw (for better read-write performance). Right now, 'virsh snapshot' fails, because it only works if all disks are qcow2; and in fact it may be desirable to take a snapshot of only a subset of the domain's disks.

So, I think we need some way to request an operation on a subset of VM disks, in a manner that can be shared between the migration and volume management APIs. I'm not sure it makes sense to add two more parameters to the migration commands (an array of disks, and the size of that array), nor to modify the snapshot XML to describe which disks belong to the snapshot. So I'm thinking we need some sort of API set to manage a stateful set of disk operations.

Maybe the trick is to define that every VM has a (possibly empty) set of selected disks, with APIs to move a single disk in or out of the set, an API for listing the entire set, and then a single migration flag stating that live block migration is attempted for all disks currently in the VM's selected disk set. Being stateful, this would have to be represented in XML (so that if libvirtd is restarted, it remembers which disks are selected); I'm thinking of adding a new selected='yes|no' attribute to <disk>, as in:

<disk type='file' device='disk' selected='yes'>
  <driver name='qemu' type='raw'/>
  ...
</disk>

where the attribute defaults to 'no' if absent. For hypervisors where the state is maintained by libvirtd (qemu, lxc), the XML works; for other hypervisors, the notion of a subset of selected disks would just have to fail unless there is some hypervisor-specific way to track that information alongside a domain.

For my API proposal, I'm including an unused flags argument in all the virDomainDiskSet* commands (experience has taught me well). In fact, we could even use that flags parameter to maintain parallel sets (set 0 is the set of disks to migrate, set 1 is the set of disks to snapshot, ...), although I don't think we need that complexity yet (besides, it would affect the proposed XML).
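For illustration, a fuller two-disk configuration using the proposed attribute might look like the sketch below (the file paths and target names are invented for this example): the shared OS disk is left unselected, while the local scratch disk is selected for block migration.

<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/shared/images/os.qcow2'/>
  <target dev='vda' bus='virtio'/>
</disk>
<disk type='file' device='disk' selected='yes'>
  <driver name='qemu' type='raw'/>
  <source file='/var/lib/libvirt/images/scratch.img'/>
  <target dev='vdb' bus='virtio'/>
</disk>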
/* Add disk to the domain's set of selected disks; flags ignored for now;
 * return 0 on success, 1 if already in the set, -1 on failure */
int virDomainDiskSetAdd(virDomainPtr dom, char *disk, unsigned int flags);

/* Remove disk from the domain's set of selected disks; flags ignored for
 * now; return 0 on success, 1 if already absent from the set, -1 on failure */
int virDomainDiskSetRemove(virDomainPtr dom, char *disk, unsigned int flags);

/* Add all disks to the domain's set of selected disks; flags ignored for
 * now; return 0 on success, -1 on failure */
int virDomainDiskSetAddAll(virDomainPtr dom, unsigned int flags);

/* Remove all disks from the domain's set of selected disks; flags ignored
 * for now; return 0 on success, -1 on failure */
int virDomainDiskSetRemoveAll(virDomainPtr dom, unsigned int flags);

/* Return the size of the domain's currently selected disk set, or -1 on
 * failure; flags ignored for now */
int virDomainDiskSetSize(virDomainPtr dom, unsigned int flags);

/* Populate up to n entries of the array with the names of the domain's
 * selected disk set, and return how many entries were populated, or -1 on
 * failure; flags ignored for now */
int virDomainDiskSetList(virDomainPtr dom, char **array, int n,
                         unsigned int flags);

With this API in place for tracking a subset of selected disks, we can then extend existing APIs with new flags:

/* Old way - domain migration without any disks migrated */
virDomainMigrate(dom, dconn, flags | 0, dname, uri, bandwidth)

/* New way - domain migration, including all disks in the domain's selected
 * disk set being copied to the destination */
virDomainMigrate(dom, dconn, flags | VIR_MIGRATE_WITH_DISK_SET, dname, uri,
                 bandwidth)

/* Old way - snapshot of all disks */
virDomainSnapshotCreateXML(dom, xml, 0)

/* New way - snapshot of just the disks in the selected disk set */
virDomainSnapshotCreateXML(dom, xml, VIR_DOMAIN_SAVE_DISK_SET)

I'd also like to see some collaboration between virDomainSave (for memory) and virDomainSnapshotCreateXML (for disks); unfortunately, virDomainSave doesn't take a flags argument. Maybe this calls for a new API, and possibly a new version of the header of a 'virsh save' image to track the location of snapshotted disks alongside the saved memory state:

/* Save the RAM state of domain to the base file "to".  If "xml" is NULL,
 * no disks are snapshotted.  Otherwise, "xml" is a snapshot XML that
 * describes how disk state will also be saved; if flags includes
 * VIR_DOMAIN_SAVE_DISK_SET, then the domain's selected disk set is
 * snapshotted, otherwise all disks are snapshotted.  If flags contains
 * VIR_DOMAIN_SAVE_LIVE, then the guest is resumed after the snapshot
 * completes; otherwise the guest is halted. */
int virDomainSaveFlags(virDomainPtr dom, const char *to, const char *xml,
                       unsigned int flags);
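To show how the pieces would fit together, here is a rough usage sketch, assuming the signatures above land as proposed. The disk name "vdb" is just an example, and whether disks are identified by target name or by path is still an open question; error handling and the usual connection/domain setup are trimmed.

/* Hypothetical caller: migrate the guest, copying only the local scratch
 * disk; the shared OS disk is visible to both hosts already.  dom and
 * dconn are assumed to have been obtained earlier. */
virDomainPtr ddom;

virDomainDiskSetRemoveAll(dom, 0);     /* start from an empty set */
virDomainDiskSetAdd(dom, "vdb", 0);    /* select just the scratch disk */
ddom = virDomainMigrate(dom, dconn,
                        VIR_MIGRATE_LIVE | VIR_MIGRATE_WITH_DISK_SET,
                        NULL, NULL, 0);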
Thoughts before I start implementing some of this for post-0.9.1?

-- 
Eric Blake   eblake@xxxxxxxxxx   +1-801-349-2682
Libvirt virtualization library http://libvirt.org