On Fri, Jan 22, 2021 at 13:50:25 +0100, Michal Privoznik wrote: > The virtio-mem is paravirtualized mechanism of adding/removing > memory to/from a VM. A virtio-mem-pci device is split into blocks > of equal size which are then exposed (all or only a requested > portion of them) to the guest kernel to use as regular memory. > Therefore, the device has two important attributes: > > 1) block-size, which defines the size of a block > 2) requested-size, which defines how much memory (in bytes) > is the device requested to expose to the guest. > > The 'block-size' is configured on command line and immutable > throughout device's lifetime. The 'requested-size' can be set on > the command line too, but also is adjustable via monitor. In > fact, that is how management software places its requests to > change the memory allocation. If it wants to give more memory to > the guest it changes 'requested-size' to a bigger value, and if it > wants to shrink guest memory it changes the 'requested-size' to a > smaller value. Note, value of zero means that guest should > release all memory offered by the device. Of course, guest has to > cooperate. Therefore, there is a third attribute 'size' which is > read only and reflects how much memory the guest still has. This > can be different to 'requested-size', obviously. Because of name > clash, I've named it 'actualsize' and it is dealt with in future > commits (it is a runtime information anyway). > > In the backend, memory for virtio-mem is backed by usual objects: > memory-backend-{ram,file,memfd} and their size puts the cap on > the amount of memory that a virtio-mem device can offer to a > guest. But we are already able to express this info using <size/> > under <target/>. > > Therefore, we need only two more elements to cover 'block-size' > and 'requested-size' attributes. This is the XML I've came up > with: > > <memory model='virtio-mem'> > <source> > <nodemask>1-3</nodemask> > <pagesize unit='KiB'>2048</pagesize> > </source> > <target> > <size unit='KiB'>2097152</size> > <node>0</node> > <block unit='KiB'>2048</block> > <requested unit='KiB'>1048576</requested> > </target> > <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> > </memory> > > I hope by now it is obvious that: > > 1) 'requested-size' must be an integer multiple of > 'block-size', and > 2) virtio-mem-pci device goes onto PCI bus and thus needs PCI > address. > > Then there is a limitation that the minimal 'block-size' is > transparent huge page size (I'll leave this without explanation). > > Signed-off-by: Michal Privoznik <mprivozn@xxxxxxxxxx> > --- [...] > diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst > index af540391db..2938758ec2 100644 > --- a/docs/formatdomain.rst > +++ b/docs/formatdomain.rst [...] > @@ -7274,7 +7286,8 @@ Example: usage of the memory devices > Provide ``dimm`` to add a virtual DIMM module to the guest. :since:`Since > 1.2.14` Provide ``nvdimm`` model that adds a Non-Volatile DIMM module. > :since:`Since 3.2.0` Provide ``virtio-pmem`` model to add a paravirtualized > - persistent memory device. :since:`Since 7.1.0` > + persistent memory device. :since:`Since 7.1.0` Provide ``virtio-mem`` model > + to add paravirtualized memory device. :since: `Since 7.1.0` s/: `/:`/ > > ``access`` > An optional attribute ``access`` ( :since:`since 3.2.0` ) that provides [...] > @@ -7366,6 +7380,17 @@ Example: usage of the memory devices > so other backend types should use the ``readonly`` element. :since:`Since > 5.0.0` > > + ``block`` > + For ``virtio-mem`` only. > + The size of an individual block, granularity of division of memory module. > + Must be power of two and at least equal to size of a transparent hugepage > + (2MiB on x84_64). The default is hypervisor dependant. > + > + ``requested`` > + For ``virtio-mem`` only. > + The total size of blocks exposed to the guest. Must respect ``block`` > + granularity. This is a bit misleading. I'd avoid using 'blocks'. [...] > diff --git a/src/conf/domain_validate.c b/src/conf/domain_validate.c > index 649fc335ac..b5a0c09468 100644 > --- a/src/conf/domain_validate.c > +++ b/src/conf/domain_validate.c > @@ -25,6 +25,7 @@ > #include "virconftypes.h" > #include "virlog.h" > #include "virutil.h" > +#include "virhostmem.h" > > #define VIR_FROM_THIS VIR_FROM_DOMAIN > > @@ -1389,6 +1390,8 @@ static int > virDomainMemoryDefValidate(const virDomainMemoryDef *mem, > const virDomainDef *def) > { > + unsigned long long thpSize; > + > switch (mem->model) { > case VIR_DOMAIN_MEMORY_MODEL_NVDIMM: > if (!mem->nvdimmPath) { > @@ -1442,6 +1445,42 @@ virDomainMemoryDefValidate(const virDomainMemoryDef *mem, > _("virtio-pmem does not support NUMA nodes")); > return -1; > } > + break; > + > + case VIR_DOMAIN_MEMORY_MODEL_VIRTIO_MEM: > + if (mem->requestedsize > mem->size) { > + virReportError(VIR_ERR_XML_DETAIL, "%s", > + _("requested size must be smaller than @size")); > + return -1; > + } so mem->size is the maximum size of the virtio-mem provided memory? I don't think that it's obvious from the docs. > + > + if (!VIR_IS_POW2(mem->blocksize)) { > + virReportError(VIR_ERR_XML_DETAIL, "%s", > + _("block size must be a power of two")); > + return -1; > + } > + [...] > diff --git a/tests/domaincapsmock.c b/tests/domaincapsmock.c > index d81a898dc0..34a2f9ad81 100644 > --- a/tests/domaincapsmock.c > +++ b/tests/domaincapsmock.c > @@ -17,6 +17,7 @@ > #include <config.h> > > #include "virhostcpu.h" > +#include "virhostmem.h" > #ifdef WITH_LIBXL > # include "libxl/libxl_capabilities.h" > #endif > @@ -40,3 +41,11 @@ virHostCPUGetMicrocodeVersion(virArch hostArch G_GNUC_UNUSED) > { > return 0; > } > + > +int > +virHostMemGetTHPSize(unsigned long long *size) > +{ > + /* Pretend Transparent Huge Page size is 2MiB. */ > + *size = 2048; > + return 0; > +} Put the mocking along with the implementation or separately. It seems weird to implement this in a patch which adds a new device model.