On Mon, Nov 30, 2020 at 11:06:14 +0100, Michal Privoznik wrote: > Currently, we configure QEMU to prealloc memory almost by > default. Well, by default for NVDIMMs, hugepages and if user > asked us to (via memoryBacking <allocation mode="immediate"/>). > > However, there are two cases where this approach is not the best: > > 1) in case when guest's NVDIMM is backed by real life NVDIMM. In > this case users should put <pmem/> into the <memory/> device > <source/>, like this: > > <memory model='nvdimm' access='shared'> > <source> > <path>/dev/pmem0</path> > <pmem/> > </source> > </memory> > > Instructing QEMU to do prealloc in this case means that each > page of the NVDIMM is "touched" (the first byte is read and > written back - see QEMU commit v2.9.0-rc1~26^2) which cripples > device wear. > > 2) if free-page-reporting is turned on. While the > free-page-reporting feature might not have a catchy or obvious > name, when enabled it instructs KVM and subsequently QEMU to > free pages no longer used by guest resulting in smaller memory > footprint. And preallocating whole memory goes against this. > > The BZ comment 11 mentions another, third case 'virtio-mem' but > that is not implemented in libvirt, yet. > > Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1894053 > Signed-off-by: Michal Privoznik <mprivozn@xxxxxxxxxx> > --- > src/qemu/qemu_command.c | 11 +++++++++-- > .../memory-hotplug-nvdimm-pmem.x86_64-latest.args | 2 +- > 2 files changed, 10 insertions(+), 3 deletions(-) > > diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c > index 479bcc0b0c..3df8b5ac76 100644 > --- a/src/qemu/qemu_command.c > +++ b/src/qemu/qemu_command.c > @@ -2977,7 +2977,11 @@ qemuBuildMemoryBackendProps(virJSONValuePtr *backendProps, > if (discard == VIR_TRISTATE_BOOL_ABSENT) > discard = def->mem.discard; > > - if (def->mem.allocation == VIR_DOMAIN_MEMORY_ALLOCATION_IMMEDIATE) > + /* The whole point of free_page_reporting is that as soon as guest frees > + * any memory it is freed in the host too. Prealloc doesn't make much sense > + * then. */ > + if (def->mem.allocation == VIR_DOMAIN_MEMORY_ALLOCATION_IMMEDIATE && > + def->memballoon->free_page_reporting != VIR_TRISTATE_SWITCH_ON) > prealloc = true; IIUC def->mem.allocation is an explicit user-specified configuration, in such case we should not override the user wish based on a different configuration. If they don't make sense together technically, we should reject the config rather than silently changing it. If it's just semantically wrong and pre-existing we may leave it be. Additionally the patch is missing a test case for this scenario.