On Mon, May 02, 2011 at 04:29:49PM -0500, Adam Litke wrote: > After several long distractions, I am back to working on disk streaming. > Before I hit the list with a new series of patches, I was hoping to > reach some middle ground on the proposed streaming API. > > On Fri, 2011-04-08 at 14:31 +0100, Daniel P. Berrange wrote: > > On Thu, Apr 07, 2011 at 04:31:59PM -0500, Adam Litke wrote: > > > /* > > > + * Disk Streaming > > > + */ > > > +typedef enum { > > > + VIR_STREAM_DISK_ONE = 1, /* Stream a single disk unit */ > > > + VIR_STREAM_DISK_START = 2, /* Stream the entire disk */ > > > + VIR_STREAM_DISK_STOP = 4, /* Stop streaming a disk */ > > > +} virDomainStreamDiskFlags; > > > > Using flags to combine two separate tasks into one single API > > is rather unpleasant. As raised in the previous patch, the API > > should also be taking a offset+length in bytes, then there is > > no need for a special case transfer of an individual sector. > > This is a fair point. I will work with Stefan to support an > offset/length qemu API. Since there doesn't seem to be a great way to > query device sizes, I think we do need a convenient way to request that > the entire disk be streamed. We could either do this with a flag or by > overriding (offset==0 && length==0) to mean stream the entire device. > Do you have a preference? Since length==0 is otherwise meaningless, it is fine to use that to indicate "until end of disk". This is consistent with what we do for virStorageVolUpload/Download which allow length=0 to indicate "until end of disk". > > > +#define VIR_STREAM_PATH_BUFLEN 1024 > > > +#define VIR_STREAM_DISK_MAX_STREAMS 10 > > > + > > > +typedef struct _virStreamDiskState virStreamDiskState; > > > +struct _virStreamDiskState { > > > + char path[VIR_STREAM_PATH_BUFLEN]; > > > + /* > > > + * The unit of measure for size and offset is unspecified. These fields > > > + * are meant to indicate the progress of a continuous streaming operation. > > > + */ > > > + unsigned long long offset; /* Current offset of active streaming */ > > > + unsigned long long size; /* Disk size */ > > > +}; > > > +typedef virStreamDiskState *virStreamDiskStatePtr; > > > + > > > +unsigned long long virDomainStreamDisk(virDomainPtr dom, > > > + const char *path, > > > + unsigned long long offset, > > > + unsigned int flags); > > > + > > > +int virDomainStreamDiskInfo(virDomainPtr dom, > > > + virStreamDiskStatePtr states, > > > + unsigned int nr_states, > > > + unsigned int flags); > > > > I would have liked it if we could use the existing JobInfo APIs for > > getting progress information, but if we need to allow concurrent > > usage for multiple disks per-VM, we can't. I think we should still > > use a similar style of API though. > > The goal is to eventually support concurrent streams. Therefore, we > will probably need to roll our own Status and Abort APIs (see my > proposed API below). > > > There also doesn't appear to be a precise way to determine if the > > copying of an entire disk failed part way through, and if so, how > > much was actually copied. > > > > Taking all the previous points together, I think the API needs to > > look like this: > > > > typedef enum { > > /* If set, virDomainBlockAllocate() will return immediately > > * allowing polling for operation completion & status > > */ > > VIR_DOMAIN_DISK_ALLOCATE_NONBLOCK, > > } virDomainBlockAllocateFlags; > > > > /* > > * @path: fully qualified filename of the virtual disk > > I probably misnamed it, but path is actually the device alias, not a > path to an image file. Hmm, I wonder whether that is a good choice or not. The other APIs all use the disk path. Perhaps we could use that as default and have a flag to indicate whether the path should be treated as a device alias instead. Thus getting both options. > > > * @offset: logical position in bytes, within the virtual disk > > * @length: amount of data to copy, in bytes starting from @offset > > * @flags: One of virDomainBlockAllocateFlags, or zero > > * > > * Ensure the virtual disk @path is fully allocated > > * between @offset and @offset+@length. If a backing > > * store is present, data will be filled from the > > * backing store, otherwise data will be fileld with > > * zeros > > * > > * If @flags contains VIR_DOMAIN_DISK_ALLOCATE_NONBLOCK, > > * this API will return immediately after initiating the > > * copy, otherwise it will block until copying is complete > > * > > */ > > int virDomainBlockAllocate(virDomainPtr dom, > > const char *path, > > unsigned long long offset, > > unsigned long long length, > > unsigned int flags); > > > > > > /* > > * @path: fully qualified filename of the virtual disk > > * @info: allocated struct to return progress info > > * > > * Query the progress of a disk allocation job. This > > * API must be used when virDomainBlockAllocate() was > > * invoked with the VIR_DOMAIN_DISK_ALLOCATE_NONBLOCK > > * flag set. > > * > > * The @info.type field will indicate whether the job > > * was completed successfully, or failed part way > > * through. > > * > > * The @info data progress fields will contain current > > * progress information. > > * > > * The hypervisor driver may optionally chose to also > > * fillin a time estimate for completion. > > */ > > int virDomainBlockGetJobInfo(virDomainPtr dom, > > const char *path, > > virDomainJobInfoPtr info); > > > > /* > > * @path: fully qualified filename of the virtual disk > > * > > * Request that a disk allocation job be aborted at > > * the soonest opportunity. This API can only be used > > * when virDomainBlockAllocate() was invoked with the > > * VIR_DOMAIN_DISK_ALLOCATE_NONBLOCK flag set. > > */ > > int virDomainBlockAbortJob(virDomainPtr dom, > > const char *path); > > > > > > typedef struct _virDomainBlockRegion virDomainBlockRegion; > > typedef virDomainBlockRegion *virDomainBlockRegionPtr; > > struct _virDomainBlockRegion { > > /* The logical offset within the file of the allocated region */ > > unsigned long long offset; > > /* The length of the allocated region */ > > unsigned long long length; > > }; > > > > /* > > * @path: fully qualified filename of the virtual disk > > * @nregions: filled in the number of @region structs > > * @regions: filled with a list of allocated regions > > * > > * Query the extents of allocated regions within the > > * virtual disk file. The offsets in the list of regions > > * are not guarenteed to be sorted in any explicit order. > > */ > > int virDomainBlockGetAllocationMap(virDomainPtr dom, > > const char *path, > > unsigned int *nregions, > > virDomainBlockRegionPtr *regions); > > I am not convinced that we really need the block allocation map stuff. > It's a fun toy, but the layout of data within a device is far too > low-level of a concept to be exposing to users. In my opinion, > streaming should really be done sequentially from the start of the > device to the end (with arbitrary chunk sizes allowed for throttling > purposes). If an application wants to stream according to some special > pattern, let them maintain a data structure outside of libvirt to manage > that extra complexity. It's not an error to stream a chunk that is > already present. The populated areas will just complete immediately. > Therefore, there isn't a need to maintain a strict record of outstanding > regions. Well we can always add something like this in at a later date, so it is fine to drop it. > > This takes care of things for running guests. It would be > > desirable to have the same functionality available when a > > guest is not running, via the virStorageVol APIs. Indeed, > > this would allow access to the allocation functionality > > for disks not explicitly associated with any VM yet. > > In light of previous discussion about the complexities around storage > formats and filesystems, I am not sure how useful such an offline API is > going to be in practice. At any rate, I would prefer to consider that > issue separately. > > So, with my points taken into account I would like to counter with the > following API proposal: > > ==== snip ==== > > typedef enum { > /* If set, virDomainBlockAllocate() will return immediately > * allowing polling for operation completion & status > */ > VIR_DOMAIN_DISK_STREAM_NONBLOCK, > } virDomainBlockStreamFlags; > > /* > * @device: alias of the target block device > * @offset: logical position in bytes, within the virtual disk > * @length: amount of data to copy, in bytes starting from @offset > * @flags: One of virDomainBlockAllocateFlags, or zero > * > * Ensure the virtual disk @device is fully allocated > * between @offset and @offset+@length. If a backing > * store is present, data will be filled from the > * backing store, otherwise data will be fileld with > * zeros. > * > * If @flags contains VIR_DOMAIN_DISK_STREAM_NONBLOCK, > * this API will return immediately after initiating the > * copy, otherwise it will block until copying is complete > * > */ > int virDomainBlockStream(virDomainPtr dom, > const char *device, > unsigned long long offset, > unsigned long long length, > unsigned int flags); > > /* > * @device: alias of the target block device > * > * Request that a disk streaming job be aborted at > * the soonest opportunity. This API can only be used > * when virDomainBlockStream() was invoked with the > * VIR_DOMAIN_DISK_STREAM_NONBLOCK flag set. > */ > int virDomainBlockStreamAbort(virDomainPtr dom, > const char *device); > > typedef enum { > VIR_BLOCK_STREAM_STATE_COMPLETED = 0, > VIR_BLOCK_STREAM_STATE_ACTIVE = 1, > VIR_BLOCK_STREAM_STATE_FAILED = 2, > } virBlockStreamStatus; > > typedef struct _virBlockStreamInfo virBlockStreamInfo; > struct _virBlockStreamInfo { > virBlockStreamStatus status; Coding guidelines don't allow use of enum types in structs or as API parameters, instead requiring 'int'. > unsigned long long remaining; /* number of bytes remaining */ > }; > typedef virBlockStreamInfo *virBlockStreamInfoPtr; s/virBlock/virDomainBlock/ in several places here. > > /* > * @device: alias of the target block device > * @info: allocated struct to return progress info > * > * Query the progress of a disk streaming job. This > * API must be used when virDomainBlockStream() was > * invoked with the VIR_DOMAIN_DISK_STREAM_NONBLOCK > * flag set. > * > */ > int virDomainBlockStreamInfo(virDomainPtr dom, > const char *device, > virBlockStreamInfoPtr info); Any reason you didn't just use the existing 'virDomainJobInfoPtr' struct, with this new API ? In particular I think we want many of the other fields from that struct beyond just 'remaining'. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list