On 01/30/2012 09:00 AM, Jiri Denemark wrote: > We already provide ways to detect when a domain has been paused as a > result of I/O error, but there was no way of getting the exact error or > even the device that experienced it. This new API may be used for both. > --- > daemon/remote.c | 40 ++++++++++++++++++++++++++++++++ > include/libvirt/libvirt.h.in | 32 +++++++++++++++++++++++++ > python/generator.py | 3 +- > src/driver.h | 7 +++++ > src/libvirt.c | 52 ++++++++++++++++++++++++++++++++++++++++++ > src/libvirt_public.syms | 1 + > src/remote/remote_driver.c | 34 +++++++++++++++++++++++++++ > src/remote/remote_protocol.x | 22 +++++++++++++++++- > src/remote_protocol-structs | 16 +++++++++++++ > src/rpc/gendispatch.pl | 47 +++++++++++++++++++++++++++++++++++++ > 10 files changed, 252 insertions(+), 2 deletions(-) This is big enough that I might have split it into public API (include, python, src/{driver.h,libvirt.c,libvirt_public.syms}) and RPC implementation (daemon, src/remote, src/rpc). But I'll go ahead and review this, whether or not you split it. > +++ b/include/libvirt/libvirt.h.in > @@ -1967,6 +1967,38 @@ virDomainGetBlockIoTune(virDomainPtr dom, > int *nparams, > unsigned int flags); > > +/** > + * virDomainDiskErrorCode: > + * > + * Disk I/O error. > + */ > +typedef enum { > + VIR_DOMAIN_DISK_ERROR_NONE = 0, /* no error */ > + VIR_DOMAIN_DISK_ERROR_UNSPEC = 1, /* unspecified I/O error */ > + VIR_DOMAIN_DISK_ERROR_NO_SPACE = 2, /* no space left on the device */ > + > +#ifdef VIR_ENUM_SENTINELS > + VIR_DOMAIN_DISK_ERROR_LAST > +#endif > +} virDomainDiskErrorCode; Looks reasonable and extensible. > + > +/** > + * virDomainDiskError: > + * > + */ > +typedef struct _virDomainDiskError virDomainDiskError; > +typedef virDomainDiskError *virDomainDiskErrorPtr; > + > +struct _virDomainDiskError { > + char *disk; /* disk target */ > + int error; /* virDomainDiskErrorCode */ > +}; > + > +int virDomainGetDiskErrors(virDomainPtr dom, > + virDomainDiskErrorPtr errors, > + int maxerrors, I guess inertia is on our side for using int maxerrors, even though the value should never be negative. > +++ b/python/generator.py > @@ -423,7 +423,8 @@ skip_impl = ( > 'virDomainGetBlockIoTune', > 'virDomainSetInterfaceParameters', > 'virDomainGetInterfaceParameters', > - 'virDomainGetCPUStats' # not implemented now. > + 'virDomainGetCPUStats', # not implemented now. > + 'virDomainGetDiskErrors', Oops - blame me for not using a trailing comma, so that your hunk had to touch more lines than one :) > +/** > + * virDomainGetDiskErrors: > + * @dom: a domain object > + * @errors: array to populate on output > + * @maxerrors: size of @errors array > + * @flags: extra flags; not used yet, so callers should always pass 0 > + * > + * The function populates @errors array with all disks that encountered an > + * I/O error. Each disk is identified by its target (the dev attribute of > + * target subelement in domain XML), such as "vda", and accompanied with > + * the error that was seen on it. Suppose I have a domain with 3 disks, and disk A and C have both encountered an error. It is not clear to me whether calling virDomainGetDiskErrors(dom, array, 3, 0) will return 2 (info on A and C, since those were the only ones with errors) or 3 (info on A, B, and C, with B explicitly stating that there were no errors). I guess virDomainDiskErrorPtr has a reserved value for no error, so I'm assuming the latter, but I'm wondering how best to word this. Maybe: s~array with all disks that encountered an I/O error~array with information on all disks, and whether they have encountered an I/O error~ Or, if you really did mean to return only the disks with errors while omitting working disks, then we should add a shortcut, where calling with errors==NULL and maxerrors==0 returns how many errors would be returned, in order for the caller to know how big to allocate things. For that matter, even if we return no error, it would _still_ be handy to support the special case as a way to count how many disks are being tracked for errors. > + * Returns number of disks with errors filled in the @errors array or -1 on > + * error. Please also document that the caller is responsible for calling free() on each disk name returned. > +int > +virDomainGetDiskErrors(virDomainPtr dom, > + virDomainDiskErrorPtr errors, > + int maxerrors, > + unsigned int flags) > +{ > + VIR_DOMAIN_DEBUG(dom, "errors=%p, maxerrors=%d, flags=%x", > + errors, maxerrors, flags); > + > + virResetLastError(); > + > + if (!VIR_IS_DOMAIN(dom)) { > + virLibDomainError(VIR_ERR_INVALID_DOMAIN, __FUNCTION__); > + virDispatchError(NULL); > + return -1; > + } > + > + if (dom->conn->flags & VIR_CONNECT_RO) { > + virLibDomainError(VIR_ERR_OPERATION_DENIED, __FUNCTION__); > + goto error; > + } Missing a sanity check that errors is non-NULL and maxerrors is > 0. It's a common check, so it should be here rather than making each driver repeat it, and we have precedence for that sort of sanity check in this file. > +struct remote_domain_get_disk_errors_ret { > + remote_domain_disk_error errors<REMOTE_DOMAIN_DISK_ERRORS_MAX>; /* insert@1 */ > +}; If we implement the shorthand of virDomainGetDiskErrors(dom, NULL, 0, 0) as a way to tell how big to allocate an array for the next call, then this struct also needs to have a return length independent from the array. Quite a few of the virTypedParameters functions should serve as a model. > + > > /*----- Protocol. -----*/ > > @@ -2708,7 +2727,8 @@ enum remote_procedure { > REMOTE_PROC_STORAGE_VOL_RESIZE = 260, /* autogen autogen */ > > REMOTE_PROC_DOMAIN_PM_SUSPEND_FOR_DURATION = 261, /* autogen autogen */ > - REMOTE_PROC_DOMAIN_GET_CPU_STATS = 262 /* skipgen skipgen */ > + REMOTE_PROC_DOMAIN_GET_CPU_STATS = 262, /* skipgen skipgen */ > + REMOTE_PROC_DOMAIN_GET_DISK_ERRORS = 263 /* autogen autogen */ Nice that you figured out how to autogen the serialize/deserialize calls. We should probably do the same for a lot of the virTypedParameter clients someday (then again, I think the reason the virTypedParameter clients don't autogen is because we don't yet have the shorthand for determining allocation limits wired up in the generator, so I may have just broken that for you by requesting the shorthand). Since adding a shorthand for returning the allocation size would imply an ABI break for RPC, we need to do it now, before the 0.9.10 freeze, so I think it's best to post a v3. -- Eric Blake eblake@xxxxxxxxxx +1-919-301-3266 Libvirt virtualization library http://libvirt.org
Attachment:
signature.asc
Description: OpenPGP digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list