I'm really struggling with how to interpret the virDomainBlockInfo struct, and think that we may want to consider the current behavior buggy and offer improved behavior going forward. For starters, even though the stats reported by virDomainListGetStats are _supposed_ to mirror the older API of virDomainGetBlockInfo, they currently do not (the newer API reads values from qemu, the older API reads values from stat) - in order to make the two consistent, I need to share code - but in order to share code, I need to know WHICH values make the most sense. Here's the current documentation: /** virDomainBlockInfo: * * This struct provides information about the size of a block device * backing store * * Examples: * * - Fully allocated raw file in filesystem: * * capacity, allocation, physical: All the same * * - Sparse raw file in filesystem: * * capacity: logical size of the file * * allocation, physical: number of blocks allocated to file * * - qcow2 file in filesystem * * capacity: logical size from qcow2 header * * allocation, physical: logical size of the file / * highest qcow extent (identical) * * - qcow2 file in a block device * * capacity: logical size from qcow2 header * * allocation: highest qcow extent written for an active domain * * physical: size of the block device container */ typedef struct _virDomainBlockInfo virDomainBlockInfo; typedef virDomainBlockInfo *virDomainBlockInfoPtr; struct _virDomainBlockInfo { unsigned long long capacity; /* logical size in bytes of the block device backing image */ unsigned long long allocation; /* highest allocated extent in bytes of the block device backing image */ unsigned long long physical; /* physical size in bytes of the container of the backing image */ }; I interpret this to mean that 'capacity' is the size that the guest sees, that 'allocation' is how much host space is required to represent that to the guest, and 'physical' is the size of the container on the host. So there are at least two things that strike me as incorrect or confusing in that documentation. First, our usage of 'physical' in the examples _changes_ depending on whether the file is raw or a block device. I think that 'physical' should represent the size of a file that 'ls' would show, no matter what, while 'allocation' is what 'du' would show. That is, for the 'sparse raw file in filesystem', I think we want 'capacity == physical == size guest would see', and only 'allocation' would be smaller; our example of 'physical == allocation' for a sparse file sounds wrong, because 'physical' should be the size of the container (the host file) rather than the amount of host storage used. Second, when this struct was first introduced, it predated the ability of the kernel to punch holes in a file. But now qemu can do a very good job of reclaiming space in a qcow2 file by punching holes and making the file sparse. So I think that 'qcow2 file in filesystem' should allow for 'allocation < physical' - again, with the notion that 'physical' is the size of the qcow2 file in the host that 'ls' would show, and 'allocation' is the storage occupied on the host as 'du' would show. This would make things more consistent with the 'qcow2 file in a block device', where 'physical' is the size of the device (well, 'ls' doesn't show that, so you have to open/lseek the device to find it) and where 'allocation' is the amount of the device used (here, relying on qemu to tell us the highest extent written so far, since block devices don't have holes). Oh, and qemu has a limitation - it only reports highest extent written when it actually writes to an image; if a file is opened read-only (such as a backing file), qemu reports 0 as the highest extent written, even though that cannot be true (for a qcow2 file to exist, at least the first cluster must have been previously written with metadata) - so it would be nice if future qemu could be patched to determine the highest extent when opening an image, rather than on first write to the image. Also, it is annoying that qemu splits its reporting: 'capacity' is reported via 'query-block' (the "actual-size" value); this command also gives what appears to be a great 'physical' value for files (the "actual-size" parameter) although that size is always 0 for block devices; meanwhile, the highest extent written ('allocation' for block devices) is reported via 'query-blockstats' (the "wr_highest_offset" value). Can anyone give a reason why changing semantics of 'physical' and 'allocation' as mentioned above would negatively break backwards compatibility, rather than being considered a bug fix? -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
Attachment:
signature.asc
Description: OpenPGP digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list