virDomainGetBlockInfo meanings

Eric Blake <eblake@xxxxxxxxxx> · Mon, 15 Dec 2014 17:29:14 -0700

I'm really struggling with how to interpret the virDomainBlockInfo
struct, and think that we may want to consider the current behavior
buggy and offer improved behavior going forward.

For starters, even though the stats reported by virDomainListGetStats
are _supposed_ to mirror the older API of virDomainGetBlockInfo, they
currently do not (the newer API reads values from qemu, the older API
reads values from stat) - in order to make the two consistent, I need to
share code - but in order to share code, I need to know WHICH values
make the most sense.

Here's the current documentation:

/** virDomainBlockInfo:
 *
 * This struct provides information about the size of a block device
 * backing store
 *
 * Examples:
 *
 *  - Fully allocated raw file in filesystem:
 *       * capacity, allocation, physical: All the same
 *
 *  - Sparse raw file in filesystem:
 *       * capacity: logical size of the file
 *       * allocation, physical: number of blocks allocated to file
 *
 *  - qcow2 file in filesystem
 *       * capacity: logical size from qcow2 header
 *       * allocation, physical: logical size of the file /
 *                               highest qcow extent (identical)
 *
 *  - qcow2 file in a block device
 *       * capacity: logical size from qcow2 header
 *       * allocation: highest qcow extent written for an active domain
 *       * physical: size of the block device container
 */
typedef struct _virDomainBlockInfo virDomainBlockInfo;
typedef virDomainBlockInfo *virDomainBlockInfoPtr;
struct _virDomainBlockInfo {
    unsigned long long capacity;   /* logical size in bytes of the block
device backing image */
    unsigned long long allocation; /* highest allocated extent in bytes
of the block device backing image */
    unsigned long long physical;   /* physical size in bytes of the
container of the backing image */
};

I interpret this to mean that 'capacity' is the size that the guest
sees, that 'allocation' is how much host space is required to represent
that to the guest, and 'physical' is the size of the container on the host.

So there are at least two things that strike me as incorrect or
confusing in that documentation.

First, our usage of 'physical' in the examples _changes_ depending on
whether the file is raw or a block device.  I think that 'physical'
should represent the size of a file that 'ls' would show, no matter
what, while 'allocation' is what 'du' would show.  That is, for the
'sparse raw file in filesystem', I think we want 'capacity == physical
== size guest would see', and only 'allocation' would be smaller; our
example of 'physical == allocation' for a sparse file sounds wrong,
because 'physical' should be the size of the container (the host file)
rather than the amount of host storage used.

Second, when this struct was first introduced, it predated the ability
of the kernel to punch holes in a file.  But now qemu can do a very good
job of reclaiming space in a qcow2 file by punching holes and making the
file sparse.  So I think that 'qcow2 file in filesystem' should allow
for 'allocation < physical' - again, with the notion that 'physical' is
the size of the qcow2 file in the host that 'ls' would show, and
'allocation' is the storage occupied on the host as 'du' would show.

This would make things more consistent with the 'qcow2 file in a block
device', where 'physical' is the size of the device (well, 'ls' doesn't
show that, so you have to open/lseek the device to find it) and where
'allocation' is the amount of the device used (here, relying on qemu to
tell us the highest extent written so far, since block devices don't
have holes).

Oh, and qemu has a limitation - it only reports highest extent written
when it actually writes to an image; if a file is opened read-only (such
as a backing file), qemu reports 0 as the highest extent written, even
though that cannot be true (for a qcow2 file to exist, at least the
first cluster must have been previously written with metadata) - so it
would be nice if future qemu could be patched to determine the highest
extent when opening an image, rather than on first write to the image.
Also, it is annoying that qemu splits its reporting: 'capacity' is
reported via 'query-block' (the "actual-size" value); this command also
gives what appears to be a great 'physical' value for files (the
"actual-size" parameter) although that size is always 0 for block
devices; meanwhile, the highest extent written ('allocation' for block
devices) is reported via 'query-blockstats' (the "wr_highest_offset" value).

Can anyone give a reason why changing semantics of 'physical' and
'allocation' as mentioned above would negatively break backwards
compatibility, rather than being considered a bug fix?

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment:
signature.asc

Description: OpenPGP digital signature
--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list