On Fri, Jun 12, 2015 at 13:29:25 -0600, Eric Blake wrote: > qemu 2.3 added a new QMP command block-set-write-threshold, > which allows callers to get an interrupt when a file hits a > write threshold, rather than the current approach of repeatedly > polling for file allocation. This patch prepares the API for > callers to register to receive the event, as well as a way > to query the threshold via virDomainListGetStats(). > > The event is one-shot in qemu - a guest must re-register a new > threshold after each time it triggers. However, the > virConnectDomainEventRegisterAny() call does not allow > parameterization, so callers must use a pair of APIs - one > to register the callback (one-time call) that will be used each > time a threshold triggers for any guest disk, and another to > repeatedly set the desired threshold (must be called each time > a threshold should be changed). > > Note that the threshold can either be registered by a byte > offset, or by a thousandth of a percentage (a value between > 0 and 100000, scaled to the disk size). But the value is > always reported as a byte offset, even when registered as a > percentage. I also considered having the setup parameter be > a double, to allow a finer resolution on percentage; with the > choice of an integer fixed-point scale, this means a 100G > disk can only set a threshold to a granularity of 1M, but > that is probably sufficient for the usage. > > To make the patch series more digestible, this patch > intentionally omits remote support, by using a couple of > placeholders at a point where the compiler forces the addition > of a case label within a switch statement. > > * include/libvirt/libvirt-domain.h > (virDomainBlockSetWriteThreshold): New API. > (virConnectDomainEventWriteThresholdCallback): New event. > * src/libvirt_public.syms (LIBVIRT_1.2.17): Export it. > * src/libvirt-domain.c (virDomainBlockSetWriteThreshold): New API. > (virConnectGetAllDomainStats): New stat. > * src/driver-hypervisor.h (virDrvDomainBlockSetWriteThreshold): > New hypervisor entry point. > * tools/virsh-domain.c (vshEventWriteThresholdPrint): Print new > event. > * tools/virsh.pod (domstats): Document new stat. > * daemon/remote.c (domainEventCallbacks): Add stub. > * src/conf/domain_event.c (virDomainEventDispatchDefaultFunc): > Likewise. > > Signed-off-by: Eric Blake <eblake@xxxxxxxxxx> > --- > daemon/remote.c | 2 + > include/libvirt/libvirt-domain.h | 47 ++++++++++++++++++++ > src/conf/domain_event.c | 4 +- > src/driver-hypervisor.h | 7 +++ > src/libvirt-domain.c | 95 ++++++++++++++++++++++++++++++++++++++++ > src/libvirt_public.syms | 5 +++ > tools/virsh-domain.c | 23 ++++++++++ > tools/virsh.pod | 1 + > 8 files changed, 183 insertions(+), 1 deletion(-) > ... > diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h > index d851225..7514656 100644 > --- a/include/libvirt/libvirt-domain.h > +++ b/include/libvirt/libvirt-domain.h > @@ -1297,6 +1297,17 @@ int virDomainBlockStatsFlags (virDomainPtr dom, > virTypedParameterPtr params, > int *nparams, > unsigned int flags); > + > +typedef enum { > + /* threshold is thousandth of a percentage (0 to 100000) relative to You managed to choose a unusual unit. Commonly used ones are 1/1000 and 1/1 000 000. Financial world also uses 1/10 000. Your unit of 1/100 000 is not among: https://en.wikipedia.org/wiki/Parts-per_notation#Parts-per_expressions I'd again suggest to use 1/1 000 000. Or if you want to be uber preciese you might choose 1/(2^64 - 1). > + * image size rather than byte limit */ > + VIR_DOMAIN_BLOCK_SET_WRITE_THRESHOLD_PERCENTAGE = (1 << 0), > +} virDomainBlockSetWriteThresholdFlags; > +int virDomainBlockSetWriteThreshold(virDomainPtr dom, > + const char *disk, > + unsigned long long threshold, > + unsigned int flags); > + > int virDomainInterfaceStats (virDomainPtr dom, > const char *path, > virDomainInterfaceStatsPtr stats, > @@ -3246,6 +3257,41 @@ typedef void (*virConnectDomainEventDeviceAddedCallback)(virConnectPtr conn, > void *opaque); > > /** > + * virConnectDomainEventWriteThresholdCallback: > + * @conn: connection object > + * @dom: domain on which the event occurred > + * @devAlias: device alias > + * @threshold: threshold that was exceeded, in bytes > + * @length: length beyond @threshold that was involved in the triggering > + * write, or 0 if not known > + * @opaque: application specified data > + * > + * The callback signature to use when registering for an event of type > + * VIR_DOMAIN_EVENT_ID_WRITE_THRESHOLD with virConnectDomainEventRegisterAny() > + * > + * This callback occurs when a block device detects a write event that > + * exceeds a non-zero threshold set by > + * virDomainBlockSetWriteThreshold(). When this event occurs, the > + * threshold is reset to 0, and a new limit must be installed to see > + * the event again on the same device. The intent of this event is to > + * allow time for the underlying storage to be resized dynamically > + * prior to the point where the guest would be paused due to running > + * out of space, without having to poll for allocation values. > + * > + * The contents of @devAlias will be "vda" when the threshold is triggered > + * on the active layer of guest disk vda. Some hypervisors also support > + * threshold reporting on backing images, such as during a block commit; > + * when that happens, @devAlias will be "vda[1]" for the backingStore at > + * index 1 within the chain of host resources for guest disk vda. Is it perhaps worth to include a optional field that will contain the file path since most use cases of this event will use a local block device with the event? iscsi and NBD block devices then could return the field empty as we now do in the bulk stats API > + */ > +typedef void (*virConnectDomainEventWriteThresholdCallback)(virConnectPtr conn, > + virDomainPtr dom, > + const char *devAlias, > + unsigned long long threshold, > + unsigned long long length, > + void *opaque); > + > +/** > * VIR_DOMAIN_TUNABLE_CPU_VCPUPIN: > * > * Macro represents formatted pinning for one vcpu specified by id which is ... > diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c > index 7e6d749..53114d3 100644 > --- a/src/libvirt-domain.c > +++ b/src/libvirt-domain.c > @@ -5743,6 +5743,99 @@ virDomainBlockStatsFlags(virDomainPtr dom, > > > /** > + * virDomainBlockSetWriteThreshold: > + * @dom: pointer to domain object > + * @disk: path to the block device, or device shorthand > + * @threshold: limit at which a write threshold event can trigger > + * @flags: bitwise-OR of virDomainBlockSetWriteThresholdFlags > + * > + * This function is used to set a one-shot write threshold. It should > + * be used in tandem with virConnectDomainEventRegisterAny() > + * installing a handler for VIR_DOMAIN_EVENT_ID_WRITE_THRESHOLD. If > + * the hypervisor detects that a write request (whether guest data, or > + * host metadata) would exceed the host byte offset specified in > + * @threshold, then an event is raised, and the threshold is reset to > + * 0 at that time. The event registration is only needed once, but > + * this function must be called each time a new threshold is desired; > + * the event will only fire if a non-zero threshold is > + * exceeded. > + * > + * By default, @threshold is specified in bytes, and must not exceed > + * the size of the block device. However, when @flags includes > + * VIR_DOMAIN_BLOCK_SET_WRITE_THRESHOLD_PERCENTAGE, @threshold is > + * instead a value between 0 an 100,000, as a thousandth of a percent > + * of the current size of the disk, and the driver will compute the > + * corresponding byte value. For example, 80000 represents 80.000%. > + * A driver may round the requested threshold to a granularity that > + * can actually be supported. > + * > + * Setting a threshold allows event-based resizing of host resources > + * that back a guest disk without having to poll the current disk > + * allocation, while still having enough time to complete the resize > + * before the guest would end up halting due to insufficient space. > + * Calling this function to set the threshold back to zero will stop > + * further firing of the event. virConnectGetAllDomainStats() can be > + * used to track the current threshold value, always in the form > + * normalized to bytes. > + * > + * The @disk parameter is either the device target shorthand (the > + * <target dev='...'/> sub-element, such as "vda"), or (since 0.9.8) Since this will be added in 1.3.0 (or 1.2.15) the "since" statement is not exactly true. > + * an unambiguous source name of the block device (the <source > + * file='...'/> sub-element, such as "/path/to/image"). Valid names > + * can be found by calling virDomainGetXMLDesc() and inspecting > + * elements within //domain/devices/disk. Some drivers might also > + * accept strings such as "vda[1]" for setting the threshold of a > + * backing image, useful when doing a block commit into the backing > + * image. > + * > + * Domains may have more than one block device. To set thresholds for > + * each you should make multiple calls to this function. If write > + * thresholds are not supported, an application will have to instead > + * poll virDomainGetBlockInfo() or similar to track allocation. > + * > + * Returns -1 in case of error, 0 in case of success. > + */ Otherwise looks good to me. Peter
Attachment:
signature.asc
Description: Digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list