On 09/17/2018 09:14 AM, marcandre.lureau@xxxxxxxxxx wrote: > From: Marc-André Lureau <marcandre.lureau@xxxxxxxxxx> > > Add a new memoryBacking source type "memfd", supported by QEMU (when > the apability is available). *capability > > A memfd is a specialized anonymous memory kind. As such, an anonymous > source type could be automatically using a memfd. However, there are > some complications when migrating from different memory backends in > qemu (mainly due to the internal object naming at this point, but > there could be more). For now, it is simpler and safer to simply > introduce a new source type "memfd". Eventually, the "anonymous" type > could learn to use memfd transparently in a seperate change. *separate > > The main benefits are that it doesn't need to create filesystem files, > and it also enforces sealing, providing a bit more safety. > > Signed-off-by: Marc-André Lureau <marcandre.lureau@xxxxxxxxxx> > --- > docs/formatdomain.html.in | 9 +-- > docs/schemas/domaincommon.rng | 1 + > src/conf/domain_conf.c | 3 +- > src/conf/domain_conf.h | 1 + > src/qemu/qemu_command.c | 69 +++++++++++++------ > src/qemu/qemu_domain.c | 12 +++- > .../memfd-memory-numa.x86_64-latest.args | 34 +++++++++ > tests/qemuxml2argvdata/memfd-memory-numa.xml | 36 ++++++++++ > tests/qemuxml2argvtest.c | 2 + > 9 files changed, 140 insertions(+), 27 deletions(-) > create mode 100644 tests/qemuxml2argvdata/memfd-memory-numa.x86_64-latest.args > create mode 100644 tests/qemuxml2argvdata/memfd-memory-numa.xml > More recently I've been trying to enforce separating XML/conf/rng/docs changes from qemu/args changes... This makes review and testing a bit easier and more "restricted". Since I didn't make it clear previously and I can split things up - no problem. I'll also be adding a "qemuxml2xmltest" for the input file to "prove" it generates the output. It'll of course need to add the QEMU_CAPS_OBJECT_MEMORY_MEMFD_HUGETLB to the DO_TEST. Adding xml2xmltest is something required when we add new attributes or input options. I'll split the commit message appropriately too. BTW: I think if "someone" follows this up with moving the qemu_command logic into a new qemuDomainPrepare* method, then I think we can separate the "new" or "fresh" start from the migration start and thus might be able to generate a mechanism that would use memfd for anonymous with the right capabilities present. Not sure it'll fly, but it may be worth a shot. It's getting more and more painful to be stuck with "old stuff". > diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in > index 1f12ab5b42..eeee1f6d40 100644 > --- a/docs/formatdomain.html.in > +++ b/docs/formatdomain.html.in > @@ -1099,7 +1099,7 @@ > </hugepages> > <nosharepages/> > <locked/> > - <source type="file|anonymous"/> > + <source type="file|anonymous|memfd"/> > <access mode="shared|private"/> > <allocation mode="immediate|ondemand"/> > <discard/> > @@ -1150,9 +1150,10 @@ > suitable for the specific environment at the same time to mitigate > the risks described above. <span class="since">Since 1.0.6</span></dd> > <dt><code>source</code></dt> > - <dd>Using the <code>type</code> attribute, it's possible to provide > - "file" to utilize file memorybacking or keep the default > - "anonymous".</dd> > + <dd>Using the <code>type</code> attribute, it's possible to > + provide "file" to utilize file memorybacking or keep the > + default "anonymous". <span class="since">Since 4.8.0</span>, > + you may choose "memfd" backing. (QEMU/KVM only)</dd> Need to keep format consistent, I'll adjust. > <dt><code>access</code></dt> > <dd>Using the <code>mode</code> attribute, specify if the memory is > to be "shared" or "private". This can be overridden per numa node by > diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng > index 099a949cf8..4b431b4188 100644 > --- a/docs/schemas/domaincommon.rng > +++ b/docs/schemas/domaincommon.rng > @@ -655,6 +655,7 @@ > <choice> > <value>file</value> > <value>anonymous</value> > + <value>memfd</value> > </choice> > </attribute> > </element> > diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c > index 1ee43950ae..648015b5b5 100644 > --- a/src/conf/domain_conf.c > +++ b/src/conf/domain_conf.c > @@ -894,7 +894,8 @@ VIR_ENUM_IMPL(virDomainDiskMirrorState, VIR_DOMAIN_DISK_MIRROR_STATE_LAST, > VIR_ENUM_IMPL(virDomainMemorySource, VIR_DOMAIN_MEMORY_SOURCE_LAST, > "none", > "file", > - "anonymous") > + "anonymous", > + "memfd") syntax-check would tell you thou shalt not use tabs > > VIR_ENUM_IMPL(virDomainMemoryAllocation, VIR_DOMAIN_MEMORY_ALLOCATION_LAST, > "none", [...] > diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c > index 2fd8a2a268..4983669a34 100644 > --- a/src/qemu/qemu_domain.c > +++ b/src/qemu/qemu_domain.c > @@ -3949,7 +3949,8 @@ qemuDomainDefValidateFeatures(const virDomainDef *def, > > > static int > -qemuDomainDefValidateMemory(const virDomainDef *def) > +qemuDomainDefValidateMemory(const virDomainDef *def, > + virQEMUCapsPtr qemuCaps) > { > const long system_page_size = virGetSystemPageSizeKB(); > const virDomainMemtune *mem = &def->mem; > @@ -3971,6 +3972,13 @@ qemuDomainDefValidateMemory(const virDomainDef *def) > return -1; > } > > + if (mem->source == VIR_DOMAIN_MEMORY_SOURCE_MEMFD && > + !virQEMUCapsGet(qemuCaps, QEMU_CAPS_OBJECT_MEMORY_MEMFD_HUGETLB)) { > + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", > + _("hugepages is not support with memfd memory source")); _("hugepages are not supported using memfd memory " "source with this version of QEMU")); > + return -1; > + } > + > /* We can't guarantee any other mem.access > * if no guest NUMA nodes are defined. */ > if (mem->hugepages[0].size != system_page_size && > @@ -4110,7 +4118,7 @@ qemuDomainDefValidate(const virDomainDef *def, > if (qemuDomainDefValidateFeatures(def, qemuCaps) < 0) > goto cleanup; > > - if (qemuDomainDefValidateMemory(def) < 0) > + if (qemuDomainDefValidateMemory(def, qemuCaps) < 0) > goto cleanup; > > ret = 0; [...] > diff --git a/tests/qemuxml2argvdata/memfd-memory-numa.xml b/tests/qemuxml2argvdata/memfd-memory-numa.xml > new file mode 100644 > index 0000000000..8416a990fa > --- /dev/null > +++ b/tests/qemuxml2argvdata/memfd-memory-numa.xml I don't recall from the original change, but each of the lines is prefixed by 2 extra spaces... I'll fix before pushing. I can fixup the nits noted. I'll wait until tomorrow before pushing so that if Michal or Pavel wish to comment they can... Reviewed-by: John Ferlan <jferlan@xxxxxxxxxx> John > @@ -0,0 +1,36 @@ > + <domain type='kvm' id='56'> > + <name>instance-00000092</name> > + <uuid>126f2720-6f8e-45ab-a886-ec9277079a67</uuid> > + <memory unit='KiB'>14680064</memory> > + <currentMemory unit='KiB'>14680064</currentMemory> > + <memoryBacking> > + <hugepages> > + <page size="2" unit="M"/> > + </hugepages> > + <source type='memfd'/> > + <access mode='shared'/> > + <allocation mode='immediate'/> > + </memoryBacking> > + <numatune> > + <memnode cellid='0' mode='preferred' nodeset='3'/> > + </numatune> > + <vcpu placement='static'>8</vcpu> > + <os> > + <type arch='x86_64' machine='pc-i440fx-wily'>hvm</type> > + <boot dev='hd'/> > + </os> > + <cpu> > + <topology sockets='1' cores='8' threads='1'/> > + <numa> > + <cell id='0' cpus='0-7' memory='14680064' unit='KiB'/> > + </numa> > + </cpu> > + <clock offset='utc'/> > + <on_poweroff>destroy</on_poweroff> > + <on_reboot>restart</on_reboot> > + <on_crash>destroy</on_crash> > + <devices> > + <emulator>/usr/bin/qemu-system-x86_64</emulator> > + <memballoon model='virtio'/> > + </devices> > + </domain> [...] -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list