On Wed, Nov 07, 2018 at 10:47:01AM +0100, Michal Privoznik wrote:
On 11/07/2018 12:43 AM, John Ferlan wrote:On 11/5/18 9:49 AM, Michal Privoznik wrote:https://bugzilla.redhat.com/show_bug.cgi?id=1624223 There are two ways to request memory preallocation on cmd line: -mem-prealloc and .prealloc attribute to memory-backend-file.s/to/for a/ ?However, as it turns out it's not safe to use both at the same time. Prefer -mem-prealloc as it is more backward compatible compared to switching to "-numa node,memdev= + -object memory-backend-file".FWIW: Issue introduced by commit 1c4f3b56.. While I understand the reasoning, it's really too bad we couldn't "move" the determination over which conflicting qualifier is used to earlier. By the time we call the -numa backend we would already have had to make the choice if I'm reading the ordering right.Correct, you're reading it right.But if it doesn't matter for the -numa object to use the -mem-prealloc, then who am I to complain. Of course the "future thinking" me that is living in the present issues surrounding machine and pc makes me wonder if choosing this as the default going forward into the future where someone could deprecate the -mem-prealloc because -numa will be so prevelant won't bite us down the road.If -mem-prealloc is deprecated then we would have to construct -object memory-backend-file. I'm not against this, but IIRC this fails during migration. I mean, if you have a guest that uses -mem-path you can't migrate it to -object memory-backing-file because qemu would fail to load the migration stream. That is why we have @needBackend in qemuBuildNumaArgStr(), so that new cmd line is built iff really needed. This is the reason I went this way even though BZ suggests otherwise.Curious how others feel - I'm not against this choice, just trying to supply an opposing/differing viewpoint. We really have to start coding for the future and consider what deprecation could mean especially for arguments that essentially mean the same thing.Signed-off-by: Michal Privoznik <mprivozn@xxxxxxxxxx> --- src/qemu/qemu_command.c | 37 +++++++++++++------ src/qemu/qemu_command.h | 1 + src/qemu/qemu_domain.c | 2 + src/qemu/qemu_domain.h | 3 ++ src/qemu/qemu_hotplug.c | 3 +- .../hugepages-numa-default-dimm.args | 2 +- 6 files changed, 35 insertions(+), 13 deletions(-) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index e338d3172e..0294030f0e 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -3123,6 +3123,7 @@ qemuBuildControllerDevCommandLine(virCommandPtr cmd, * @def: domain definition object * @mem: memory definition object * @autoNodeset: fallback nodeset in case of automatic NUMA placement + * @forbidPrealloc: don't set prealloc attributeSlight bikeshed, but this changes the priv->memAlloc to @forbidPrealloc which is IMO a bit odd.Okay, what name do you suggest? My reasoning for the name was that it should make sense from the function POV. That's why calling the variable 'memAlloc' did not make sense to me.Beyond that, this becomes the 3rd @priv field to be passed along... Maybe @priv should just be passed to access qemuCaps, autoNodeset, and memPrealloc.Ah sure.* @force: forcibly use one of the backends * * Creates a configuration object that represents memory backend of given guest @@ -3136,6 +3137,9 @@ qemuBuildControllerDevCommandLine(virCommandPtr cmd, * Then, if one of the two memory-backend-* should be used, the @qemuCaps is * consulted to check if qemu does support it. * + * If @forbidPrealloc is true then 'prealloc' attribute of the backend is not + * set. This may come handy when global -mem-prealloc is already specified. + * * Returns: 0 on success, * 1 on success and if there's no need to use memory-backend-* * -1 on error. @@ -3148,6 +3152,7 @@ qemuBuildMemoryBackendProps(virJSONValuePtr *backendProps, virDomainDefPtr def, virDomainMemoryDefPtr mem, virBitmapPtr autoNodeset, + bool forbidPrealloc, bool force) { const char *backendType = "memory-backend-file"; @@ -3265,11 +3270,13 @@ qemuBuildMemoryBackendProps(virJSONValuePtr *backendProps, if (mem->nvdimmPath) { if (VIR_STRDUP(memPath, mem->nvdimmPath) < 0) goto cleanup; - prealloc = true; + if (!forbidPrealloc) + prealloc = true; } else if (useHugepage) { if (qemuGetDomainHupageMemPath(def, cfg, pagesize, &memPath) < 0) goto cleanup; - prealloc = true; + if (!forbidPrealloc) + prealloc = true; } else { /* We can have both pagesize and mem source. If that's the case, * prefer hugepages as those are more specific. */ @@ -3398,7 +3405,8 @@ qemuBuildMemoryCellBackendStr(virDomainDefPtr def, mem.info.alias = alias; if ((rc = qemuBuildMemoryBackendProps(&props, alias, cfg, priv->qemuCaps, - def, &mem, priv->autoNodeset, false)) < 0) + def, &mem, priv->autoNodeset, + priv->memPrealloc, false)) < 0) goto cleanup; if (virQEMUBuildObjectCommandlineFromJSON(buf, props) < 0) @@ -3435,7 +3443,8 @@ qemuBuildMemoryDimmBackendStr(virBufferPtr buf, goto cleanup; if (qemuBuildMemoryBackendProps(&props, alias, cfg, priv->qemuCaps, - def, mem, priv->autoNodeset, true) < 0) + def, mem, priv->autoNodeset, + priv->memPrealloc, true) < 0) goto cleanup; if (virQEMUBuildObjectCommandlineFromJSON(buf, props) < 0) @@ -7443,7 +7452,8 @@ qemuBuildSmpCommandLine(virCommandPtr cmd, static int qemuBuildMemPathStr(virQEMUDriverConfigPtr cfg, const virDomainDef *def, - virCommandPtr cmd) + virCommandPtr cmd, + qemuDomainObjPrivatePtr priv) { const long system_page_size = virGetSystemPageSizeKB(); char *mem_path = NULL; @@ -7465,8 +7475,10 @@ qemuBuildMemPathStr(virQEMUDriverConfigPtr cfg, return 0; } - if (def->mem.allocation != VIR_DOMAIN_MEMORY_ALLOCATION_IMMEDIATE) + if (def->mem.allocation != VIR_DOMAIN_MEMORY_ALLOCATION_IMMEDIATE) { virCommandAddArgList(cmd, "-mem-prealloc", NULL); + priv->memPrealloc = true; + } virCommandAddArgList(cmd, "-mem-path", mem_path, NULL); VIR_FREE(mem_path); @@ -7479,7 +7491,8 @@ static int qemuBuildMemCommandLine(virCommandPtr cmd, virQEMUDriverConfigPtr cfg, const virDomainDef *def, - virQEMUCapsPtr qemuCaps) + virQEMUCapsPtr qemuCaps, + qemuDomainObjPrivatePtr priv) { if (qemuDomainDefValidateMemoryHotplug(def, qemuCaps, NULL) < 0) return -1; @@ -7498,15 +7511,17 @@ qemuBuildMemCommandLine(virCommandPtr cmd, virDomainDefGetMemoryInitial(def) / 1024); } - if (def->mem.allocation == VIR_DOMAIN_MEMORY_ALLOCATION_IMMEDIATE) + if (def->mem.allocation == VIR_DOMAIN_MEMORY_ALLOCATION_IMMEDIATE) { virCommandAddArgList(cmd, "-mem-prealloc", NULL); + priv->memPrealloc = true; + }I find it "confusing" that setting memPrealloc = true when "def->mem.allocation == VIR_DOMAIN_MEMORY_ALLOCATION_IMMEDIATE"; however, in qemuBuildMemPathStr it's a != comparison. I know it's existing, but strange.This is so that -mem-prealloc is not added twice onto the cmd line. The first addition is done here and the second is done possibly in qemuBuildMemPathStr ..
Yeah, the memory code is... I have no words for that. The problem is there is not much re-factorings that would work in all freaking corner cases that we (unfortunately) decided to cover. It's as Michal says, on this occasion the mempath should be added here unconditionally, but the condition is there to make it not appear twice. Also it needs to be added there before the memory objects so setting a variable and then acting upon that is not the right refactor (I hope I remember that correctly). If you put it in the caller, then you get to the point it's not the only one... I don't remember how the second caller handles the prealloc, but I remember we tried bunch of refactors and there's not much to do until we deprecate some QEMU versions.
Again, I'm not against this, but would like to see if someone with more numa experience chimes in (Martin?) and whether we need to think more in terms of what deprecation could mean.It would mean inability to migrate to newer libvirt.
Well, since we deprecated some QEMU versions (finally), we can just add a flag in the migration cookie that will tell us if the current libvirt version is preferring the .prealloc and just start using that for newly started VMs. Migrating to older version won't work, but that's not supported. Unless exceptions of course, but anyone can handle that by backporting the support for the flag if that's your (or your distro's vendor's) need. TL;DR: The pre-existing condition is actually fine, unfortunately.
John/* * Add '-mem-path' (and '-mem-prealloc') parameter here if * the hugepages and no numa node is specified. */ if (!virDomainNumaGetNodeCount(def->numa) && - qemuBuildMemPathStr(cfg, def, cmd) < 0) + qemuBuildMemPathStr(cfg, def, cmd, priv) < 0).. called here. Michal -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list
Attachment:
signature.asc
Description: Digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list