On Mon, Nov 14, 2022 at 04:35:35PM +0100, Michal Privoznik wrote:
This is similar to one of previous commits which generated thread-context object for memory devices at cmd line generation phase. This one does the same for hotplug case, except it's doing so iff QEMU sandboxing is turned off. The reason is that once sandboxing is turned on, the __NR_sched_setaffinity syscall is filtered by libseccomp and thus QEMU is unable to instantiate the thread-context object.
I'm not sure this is enough checking for possible errors and I am not sure we actually need this; it looks like it might cause more problems than benefits. I can't think of a reason anyone would hotplug so much memory during runtime that it would benefit from the thread-context, although I admit I might very well be wrong on that. Anyway it can easily fail if the emulator is pinned to anything but a superset of host-nodes set for this particular memory device. There might be more restrictions from mgmt apps filtering some syscalls or whatnot. And there is an issue described below as well.
Signed-off-by: Michal Privoznik <mprivozn@xxxxxxxxxx> --- src/qemu/qemu_hotplug.c | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index da92ced2f4..5c49da87ba 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -2240,11 +2240,13 @@ qemuDomainAttachMemory(virQEMUDriver *driver, g_autoptr(virJSONValue) devprops = NULL; g_autofree char *objalias = NULL; bool objAdded = false; + bool tcObjAdded = false; bool releaseaddr = false; bool teardownlabel = false; bool teardowncgroup = false; bool teardowndevice = false; g_autoptr(virJSONValue) props = NULL; + g_autoptr(virJSONValue) tcProps = NULL; virObjectEvent *event; int id; int ret = -1; @@ -2273,6 +2275,11 @@ qemuDomainAttachMemory(virQEMUDriver *driver, priv, vm->def, mem, true, false) < 0) goto cleanup; + /* In case sandbox was turned on, thread-context won't work. */ + if (cfg->seccompSandbox == 0 && + qemuBuildThreadContextProps(&tcProps, &props, priv) < 0) + goto cleanup; +
This function adds the alias to the list of tc aliases...
if (qemuProcessBuildDestroyMemoryPaths(driver, vm, mem, true) < 0) goto cleanup; @@ -2294,6 +2301,12 @@ qemuDomainAttachMemory(virQEMUDriver *driver, goto removedef; qemuDomainObjEnterMonitor(vm); + if (tcProps) { + if (qemuMonitorAddObject(priv->mon, &tcProps, NULL) < 0) + goto exit_monitor; + tcObjAdded = true;
If this (or anything above) fails the tcObjAdded is false (because it was not added, kinda makes sense. The hotplug fails, but if the user tries to hotplug again and it does work, then...
+ } + if (qemuMonitorAddObject(priv->mon, &props, NULL) < 0) goto exit_monitor; objAdded = true; @@ -2301,6 +2314,12 @@ qemuDomainAttachMemory(virQEMUDriver *driver, if (qemuMonitorAddDeviceProps(priv->mon, &devprops) < 0) goto exit_monitor; + if (tcObjAdded) { + if (qemuProcessDeleteThreadContext(vm) < 0) + goto exit_monitor;
This will fail because there is one leftover tc alias from the first hotplug attempt, even though this second hotplug was completely successful. Either remove it from the list if anything is unsuccessful (i.e. (tcProps && !tcObjAdded), add it to the list only when the object is added, or ignore the return value even in this call (or inside the qemuProcessDeleteThreadContext() function), so that we don't have corner cases of weird errors.
+ tcObjAdded = false; + } + qemuDomainObjExitMonitor(vm); event = virDomainEventDeviceAddedNewFromObj(vm, objalias); @@ -2339,6 +2358,8 @@ qemuDomainAttachMemory(virQEMUDriver *driver, virErrorPreserveLast(&orig_err); if (objAdded) ignore_value(qemuMonitorDelObject(priv->mon, objalias, false)); + if (tcObjAdded) + ignore_value(qemuProcessDeleteThreadContext(vm)); qemuDomainObjExitMonitor(vm); if (objAdded && mem) @@ -4380,6 +4401,7 @@ qemuDomainRemoveMemoryDevice(virQEMUDriver *driver, qemuDomainObjEnterMonitor(vm); rc = qemuMonitorDelObject(priv->mon, backendAlias, true); + /* XXX remove TC object */
Leftover? Or did you mean to call DeleteThreadContext here too? It should be safe to remove all the leftover ones here, or if not, just removing "tc-{backendAlias}" here should do.
qemuDomainObjExitMonitor(vm); virDomainAuditMemory(vm, oldmem, newmem, "update", rc == 0); -- 2.37.4
Attachment:
signature.asc
Description: PGP signature