On 08/11/2016 09:26 AM, Michal Privoznik wrote: > NVDIMM is new type of memory introduced in qemu. The idea is that s/in qemu/into QEMU 2.6/ > we have a DIMM module that keeps the data persistent across s/DIMM/Non Volatile memory/ > domain reboots. Perhaps a word of two about what is the usefulness of such a beast. I think (from a former project) one usage is to store parameters for firmware such as OVMF Add extra line between paragraphs... > At the domain XML level, we already have some representation of > 'dimm' modules. Long story short, we have <memory/> element that > lives under <devices/>. Now, the element even has @model > attribute which we can use to introduce new memory type: > > <memory model='nvdimm'> > <source> > <path>/tmp/nvdimm</path> > </source> > <target> > <size unit='KiB'>523264</size> > <node>0</node> > </target> > </memory> > > So far, this is just a XML parser/formatter extension. QEMU > driver implementation is in the next commit. > > For more info on NVDIMM visit the following web page: > > http://pmem.io/ > One could also google it ;-)... In any case, the actual details are found the Documents link from that page, subject to move over time of course. > Signed-off-by: Michal Privoznik <mprivozn@xxxxxxxxxx> > --- > docs/formatdomain.html.in | 26 ++++-- > docs/schemas/domaincommon.rng | 32 ++++--- > src/conf/domain_conf.c | 97 ++++++++++++++++------ > src/conf/domain_conf.h | 2 + > src/qemu/qemu_command.c | 6 ++ > src/qemu/qemu_domain.c | 1 + > .../qemuxml2argv-memory-hotplug-nvdimm.xml | 49 +++++++++++ > 7 files changed, 171 insertions(+), 42 deletions(-) > create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-nvdimm.xml > > diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in > index bfbb0f2..657df8f 100644 > --- a/docs/formatdomain.html.in > +++ b/docs/formatdomain.html.in > @@ -6740,6 +6740,15 @@ qemu-kvm -net nic,model=? /dev/null > <node>1</node> > </target> > </memory> > + <memory model='nvdimm'> > + <source> > + <path>/tmp/nvdimm</path> > + </source> > + <target> > + <size unit='KiB'>524287</size> > + <node>1</node> > + </target> > + </memory> > </devices> > ... > </pre> > @@ -6747,17 +6756,19 @@ qemu-kvm -net nic,model=? /dev/null > <dt><code>model</code></dt> > <dd> > <p> > - Currently only the <code>dimm</code> model is supported in order to > - add a virtual DIMM module to the guest. > + Select <code>dimm</code> to add a virtual DIMM module to the guest. Ugh... The 'dimm' Since tag is in the "Memory devices" section... Which just make the following awkward. I think you should grab that 1.2.14 from above and place it here. Rather than Select go with "Use" or "Provide" > + Alternatively, <code>nvdimm</code> model adds a Non-Volatile DIMM Use the same format - e.g. "Use/Provide <code>nvdimm</code> to add a Non-Volatile DIMM module." > + module. <span class="since">Since 2.2.0</span> Well we won't hit 2.2.0.... > </p> > </dd> > > <dt><code>source</code></dt> > <dd> > <p> > - The optional source element allows to fine tune the source of the > - memory used for the given memory device. If the element is not > - provided defaults configured via <code>numatune</code> are used. > + For model <code>dimm</code> this element is optional and allows to > + fine tune the source of the memory used for the given memory device. > + If the element is not provided defaults configured via > + <code>numatune</code> are used. > </p> > <p> > <code>pagesize</code> can optionally be used to override the default > @@ -6770,6 +6781,11 @@ qemu-kvm -net nic,model=? /dev/null > <code>nodemask</code> can optionally be used to override the default > set of NUMA nodes where the memory would be allocated. > </p> Since pagesize and nodemask are for DIMM only, so they probably need to be converted to some sort of list or in some way indented to ensure the visual cue is there that they are meant for dimm. Perhaps the end of the dimm paragraph needs a "If <code>source</code> is provided, then at least one of the following values would be provided:". I think the only way to get the right formatting look is : <dd> <dt><code>pagesize</code></dt> <p> ... </p> <dt><code>nodemask</code></dt> <p> ... </p> </dd> > + <p> > + For model <code>nvdimm</code> this element is mandatory and has a > + single child element <code>path</code> which value represents a path s/which value/that/ ? > + in host that back the nvdimm module in the guest. s/in host that back/in the host that backs/ Should there be any mention that this also requires "<maxMemory slots='#'...>" to be set? So something that isn't quite clear... For 'dimm', these are 'extra' memory, while for 'nvdimm 'it seems to be one hunk that's really not extra memory - rather it's memory that can be used for a specific purpose. What's not clear to me - is the existing "physical" memory in the guest (e.g. the numa node page) is used to map to the file or does this appear as extra memory that is only accessible for this purpose? Also, does the numa node have to be properly sized? Your example shows a 214Mb numa cell and a 511MB nvdimm. Doesn't seem to me that fits quite right. BTW: That video you point to in your v1 indicates two types of NVDIMM - "persistent memory namespace" (accessed using loads/stores, mapped to physical memory) and "block mode namespace" (accessed via block operations, atomicity at block level, indirect mapping thru a block window, no mapping entier memory. So which is KVM? Since libvirt must be the master to more than one driver - should there be some sort of subtype="block|persistent" as well? Or we could just indicate KVM only for now... > + </p> > </dd> > > <dt><code>target</code></dt> > diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng > index 052f28c..e6ad452 100644 > --- a/docs/schemas/domaincommon.rng > +++ b/docs/schemas/domaincommon.rng > @@ -4722,6 +4722,7 @@ > <attribute name="model"> > <choice> > <value>dimm</value> > + <value>nvdimm</value> > </choice> > </attribute> > <interleave> > @@ -4741,18 +4742,27 @@ > > <define name="memorydev-source"> > <element name="source"> > - <interleave> > - <optional> > - <element name="pagesize"> > - <ref name="scaledInteger"/> > + <choice> > + <group> > + <interleave> > + <optional> > + <element name="pagesize"> > + <ref name="scaledInteger"/> > + </element> > + </optional> > + <optional> > + <element name="nodemask"> > + <ref name="cpuset"/> > + </element> > + </optional> > + </interleave> > + </group> > + <group> > + <element name="path"> > + <text/> > </element> > - </optional> > - <optional> > - <element name="nodemask"> > - <ref name="cpuset"/> > - </element> > - </optional> > - </interleave> > + </group> > + </choice> > </element> > </define> > > diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c > index 82876f3..cb5a053 100644 > --- a/src/conf/domain_conf.c > +++ b/src/conf/domain_conf.c > @@ -840,8 +840,11 @@ VIR_ENUM_DECL(virDomainBlockJob) > VIR_ENUM_IMPL(virDomainBlockJob, VIR_DOMAIN_BLOCK_JOB_TYPE_LAST, > "", "", "copy", "", "active-commit") > > -VIR_ENUM_IMPL(virDomainMemoryModel, VIR_DOMAIN_MEMORY_MODEL_LAST, > - "", "dimm") > +VIR_ENUM_IMPL(virDomainMemoryModel, > + VIR_DOMAIN_MEMORY_MODEL_LAST, > + "", > + "dimm", > + "nvdimm") > > static virClassPtr virDomainObjClass; > static virClassPtr virDomainXMLOptionClass; > @@ -2345,6 +2348,7 @@ void virDomainMemoryDefFree(virDomainMemoryDefPtr def) > if (!def) > return; > > + VIR_FREE(def->path); > virBitmapFree(def->sourceNodes); > virDomainDeviceInfoClear(&def->info); > VIR_FREE(def); > @@ -13140,20 +13144,36 @@ virDomainMemorySourceDefParseXML(xmlNodePtr node, > xmlNodePtr save = ctxt->node; > ctxt->node = node; > > - if (virDomainParseMemory("./pagesize", "./pagesize/@unit", ctxt, > - &def->pagesize, false, false) < 0) > - goto cleanup; > - > - if ((nodemask = virXPathString("string(./nodemask)", ctxt))) { > - if (virBitmapParse(nodemask, &def->sourceNodes, > - VIR_DOMAIN_CPUMASK_LEN) < 0) > + switch ((virDomainMemoryModel) def->model) { > + case VIR_DOMAIN_MEMORY_MODEL_DIMM: > + if (virDomainParseMemory("./pagesize", "./pagesize/@unit", ctxt, > + &def->pagesize, false, false) < 0) > goto cleanup; > > - if (virBitmapIsAllClear(def->sourceNodes)) { > - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, > - _("Invalid value of 'nodemask': %s"), nodemask); > + if ((nodemask = virXPathString("string(./nodemask)", ctxt))) { > + if (virBitmapParse(nodemask, &def->sourceNodes, > + VIR_DOMAIN_CPUMASK_LEN) < 0) > + goto cleanup; > + > + if (virBitmapIsAllClear(def->sourceNodes)) { > + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, > + _("Invalid value of 'nodemask': %s"), nodemask); > + goto cleanup; > + } > + } > + break; > + > + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM: > + if (!(def->path = virXPathString("string(./path)", ctxt))) { > + virReportError(VIR_ERR_XML_DETAIL, "%s", > + _("path is required for model nvdimm'")); > goto cleanup; > } > + break; > + > + case VIR_DOMAIN_MEMORY_MODEL_NONE: > + case VIR_DOMAIN_MEMORY_MODEL_LAST: > + break; > } > > ret = 0; > @@ -14541,12 +14561,25 @@ virDomainMemoryFindByDefInternal(virDomainDefPtr def, > tmp->size != mem->size) > continue; > > - /* source stuff -> match with device */ > - if (tmp->pagesize != mem->pagesize) > - continue; > + switch ((virDomainMemoryModel) mem->model) { > + case VIR_DOMAIN_MEMORY_MODEL_DIMM: > + /* source stuff -> match with device */ > + if (tmp->pagesize != mem->pagesize) > + continue; > > - if (!virBitmapEqual(tmp->sourceNodes, mem->sourceNodes)) > - continue; > + if (!virBitmapEqual(tmp->sourceNodes, mem->sourceNodes)) > + continue; > + break; > + > + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM: > + if (STRNEQ(tmp->path, mem->path)) > + continue; > + break; > + > + case VIR_DOMAIN_MEMORY_MODEL_NONE: > + case VIR_DOMAIN_MEMORY_MODEL_LAST: > + break; > + } > > break; > } > @@ -21705,23 +21738,35 @@ virDomainMemorySourceDefFormat(virBufferPtr buf, > char *bitmap = NULL; > int ret = -1; > > - if (!def->pagesize && !def->sourceNodes) > + if (!def->pagesize && !def->sourceNodes && !def->path) > return 0; > > virBufferAddLit(buf, "<source>\n"); > virBufferAdjustIndent(buf, 2); > > - if (def->sourceNodes) { > - if (!(bitmap = virBitmapFormat(def->sourceNodes))) > - goto cleanup; > + switch ((virDomainMemoryModel) def->model) { > + case VIR_DOMAIN_MEMORY_MODEL_DIMM: > + if (def->sourceNodes) { > + if (!(bitmap = virBitmapFormat(def->sourceNodes))) > + goto cleanup; > > - virBufferAsprintf(buf, "<nodemask>%s</nodemask>\n", bitmap); > + virBufferAsprintf(buf, "<nodemask>%s</nodemask>\n", bitmap); > + } > + > + if (def->pagesize) > + virBufferAsprintf(buf, "<pagesize unit='KiB'>%llu</pagesize>\n", > + def->pagesize); > + break; > + > + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM: > + virBufferAsprintf(buf, "<path>%s</path>\n", def->path); > + break; > + > + case VIR_DOMAIN_MEMORY_MODEL_NONE: > + case VIR_DOMAIN_MEMORY_MODEL_LAST: > + break; > } > > - if (def->pagesize) > - virBufferAsprintf(buf, "<pagesize unit='KiB'>%llu</pagesize>\n", > - def->pagesize); > - > virBufferAdjustIndent(buf, -2); > virBufferAddLit(buf, "</source>\n"); > So do you feel there is a need for a check for different path in virDomainMemoryDefCheckABIStability? I know it's not a target thing, but the target/guest uses it by mapping directly... Should there be checks for order? That is src has dimm, dimm, dimm, nvdim while dst has nvdimm, dimm, dimm, dimm? Would that matter? After going through patch 3, I'm beginning to think it wouldn't be such a good idea to keep nvdimm's and dimm's in the same nmems list. I think they might just be too different. I also recall some algorithm somewhere which made sure addresses and sizes used fit "properly"... Searching on DIMM in cscope brings up a few more references and questions. And well - what about migration? > diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h > index 8b26724..4e6a9bf 100644 > --- a/src/conf/domain_conf.h > +++ b/src/conf/domain_conf.h > @@ -1939,6 +1939,7 @@ struct _virDomainRNGDef { > typedef enum { > VIR_DOMAIN_MEMORY_MODEL_NONE, > VIR_DOMAIN_MEMORY_MODEL_DIMM, /* dimm hotpluggable memory device */ > + VIR_DOMAIN_MEMORY_MODEL_NVDIMM, /* nvdimm memory device */ > > VIR_DOMAIN_MEMORY_MODEL_LAST > } virDomainMemoryModel; > @@ -1947,6 +1948,7 @@ struct _virDomainMemoryDef { > /* source */ > virBitmapPtr sourceNodes; > unsigned long long pagesize; /* kibibytes */ > + char *path; > > /* target */ > int model; /* virDomainMemoryModel */ > diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c > index 55df23d..a4cc87f 100644 > --- a/src/qemu/qemu_command.c > +++ b/src/qemu/qemu_command.c > @@ -3393,6 +3393,12 @@ qemuBuildMemoryDeviceStr(virDomainMemoryDefPtr mem) > > break; > > + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM: > + virReportError(VIR_ERR_NO_SUPPORT, "%s", > + _("nvdimm not supported yet")); > + return NULL; > + break; > + > case VIR_DOMAIN_MEMORY_MODEL_NONE: > case VIR_DOMAIN_MEMORY_MODEL_LAST: > break; > diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c > index efc46f9..28ee81d 100644 > --- a/src/qemu/qemu_domain.c > +++ b/src/qemu/qemu_domain.c > @@ -5203,6 +5203,7 @@ qemuDomainDefValidateMemoryHotplugDevice(const virDomainMemoryDef *mem, > { > switch ((virDomainMemoryModel) mem->model) { > case VIR_DOMAIN_MEMORY_MODEL_DIMM: > + case VIR_DOMAIN_MEMORY_MODEL_NVDIMM: So the assumption is that NVDIMM will fail the following check? I would think it would want it's own error message like above "not supported yet". Something fortified by the patch 3 code which alters the MemoryDeviceStr to be "pc-dimm" or "nvdimm"... > if (mem->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_DIMM && > mem->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE) { > virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", > diff --git a/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-nvdimm.xml b/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-nvdimm.xml > new file mode 100644 > index 0000000..e932241 > --- /dev/null > +++ b/tests/qemuxml2argvdata/qemuxml2argv-memory-hotplug-nvdimm.xml > @@ -0,0 +1,49 @@ > +<domain type='qemu'> > + <name>QEMUGuest1</name> > + <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid> > + <maxMemory slots='16' unit='KiB'>1099511627776</maxMemory> > + <memory unit='KiB'>1267710</memory> > + <currentMemory unit='KiB'>1267710</currentMemory> > + <vcpu placement='static' cpuset='0-1'>2</vcpu> > + <os> > + <type arch='i686' machine='pc'>hvm</type> > + <boot dev='hd'/> > + </os> > + <idmap> > + <uid start='0' target='1000' count='10'/> > + <gid start='0' target='1000' count='10'/> > + </idmap> > + <cpu> > + <topology sockets='2' cores='1' threads='1'/> > + <numa> > + <cell id='0' cpus='0-1' memory='219136' unit='KiB'/> > + </numa> > + </cpu> > + <clock offset='utc'/> > + <on_poweroff>destroy</on_poweroff> > + <on_reboot>restart</on_reboot> > + <on_crash>destroy</on_crash> > + <devices> > + <emulator>/usr/bin/qemu</emulator> > + <disk type='block' device='disk'> > + <source dev='/dev/HostVG/QEMUGuest1'/> > + <target dev='hda' bus='ide'/> > + <address type='drive' controller='0' bus='0' target='0' unit='0'/> > + </disk> > + <controller type='ide' index='0'/> > + <controller type='usb' index='0'/> > + <controller type='pci' index='0' model='pci-root'/> > + <input type='mouse' bus='ps2'/> > + <input type='keyboard' bus='ps2'/> > + <memballoon model='virtio'/> > + <memory model='nvdimm'> > + <source> > + <path>/tmp/nvdimm</path> > + </source> > + <target> > + <size unit='KiB'>523264</size> > + <node>0</node> > + </target> > + </memory> > + </devices> > +</domain> > No xml2xml tests? I know there's some "newer" way to use file links from tests/qemuxml2xmloutdata to ../qemuxml2argvdata/. IIRC when I did something similar I had to create the link and git add the link... Ah yes, look at the '*luks-disks* - commit id '9bbf0d7e'. But now you've already done it for the rxqsz patch too, so you're the new expert! John -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list