On Fri, Sep 04, 2020 at 02:22:14PM +0200, Martin Kletzander wrote: > At least in a particular scenario described in the code. Basically when > libvirtd is running without CAP_SYS_NICE (e.g. in a container) and it is trying > to set QEMU affinity to all CPUs (because there is no setting requested in the > XML) it fails. But if we ignore the failure in this particular case than you > can limit the CPUs used by controlling the affinity for libvirtd itself. > > In any other case (anything requested in the XML, pinning a live domain, etc.) > the call is still considered fatal and the action errors out. > > Resolves: https://bugzilla.redhat.com/1819801 I'd prefer if this commit message outlined the reason why this change is ok, instead of just pointing to the BZ eg add the following text: Consider a host with 8 CPUs. There are the following possible scenarios 1. Bare metal; libvirtd has affinity of 8 CPUs; QEMU should get 8 CPUs 2. Bare metal; libvirtd has affinity of 2 CPUs; QEMU should get 8 CPUs 3. Container has affinity of 8 CPUs; libvirtd has affinity of 8 CPus; QEMU should get 8 CPUs 4. Container has affinity of 8 CPUs; libvirtd has affinity of 2 CPus; QEMU should get 8 CPUs 5. Container has affinity of 4 CPUs; libvirtd has affinity of 4 CPus; QEMU should get 4 CPUs 6. Container has affinity of 4 CPUs; libvirtd has affinity of 2 CPus; QEMU should get 4 CPUs Scenarios 1 & 2 always work unless systemd restricted libvirtd privs. Scenario 3 works because libvirt checks current affinity first and skips the sched_setaffinity call, avoiding the SYS_NICE issue Scenario 4 works only if CAP_SYS_NICE is availalbe Scenarios 5 & 6 works only if CAP_SYS_NICE is present *AND* the cgroups cpuset is not set on the container. If libvirt blindly ignores the sched_setaffinity failure, then scenarios 4, 5 and 6 should all work, but with caveat in case 4 and 6, that QEMU will only get 2 CPUs instead of the possible 8 and 4 respectively. This is still better than failing. Therefore libvirt can blindly ignore the setaffinity failure, but *ONLY* ignore it when there was no affinity specified in the XML config. If user specified affinity explicitly, libvirt must report an error if it can't be honoured. > > Suggested-by: Daniel P. Berrangé <berrange@xxxxxxxxxx> > Signed-off-by: Martin Kletzander <mkletzan@xxxxxxxxxx> > --- > src/qemu/qemu_process.c | 41 ++++++++++++++++++++++++++++++++++++++--- > 1 file changed, 38 insertions(+), 3 deletions(-) > > diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c > index cfe09d632633..270bb37d3682 100644 > --- a/src/qemu/qemu_process.c > +++ b/src/qemu/qemu_process.c > @@ -2528,6 +2528,7 @@ qemuProcessGetAllCpuAffinity(virBitmapPtr *cpumapRet) > static int > qemuProcessInitCpuAffinity(virDomainObjPtr vm) > { > + bool settingAll = false; > g_autoptr(virBitmap) cpumapToSet = NULL; > virDomainNumatuneMemMode mem_mode; > qemuDomainObjPrivatePtr priv = vm->privateData; > @@ -2566,13 +2567,30 @@ qemuProcessInitCpuAffinity(virDomainObjPtr vm) > if (!(cpumapToSet = virBitmapNewCopy(vm->def->cputune.emulatorpin))) > return -1; > } else { > + settingAll = true; > if (qemuProcessGetAllCpuAffinity(&cpumapToSet) < 0) > return -1; > } > > if (cpumapToSet && > virProcessSetAffinity(vm->pid, cpumapToSet) < 0) { > - return -1; > + /* > + * We only want to error out if we failed to set the affinity to > + * user-requested mapping. If we are just trying to reset the affinity > + * to all CPUs and this fails it can only be an issue if: > + * 1) libvirtd does not have CAP_SYS_NICE > + * 2) libvirtd does not run on all CPUs > + * > + * However since this scenario is very improbable, we rather skip > + * reporting the error because it helps running libvirtd in a a scenario > + * where pinning is handled by someone else. I wouldn't call this scenario "improbably" - it is entirely expected by some of our users. Replace these three lines with "This scenario can easily occurr when libvirtd is run inside a container with restrictive permissions and CPU pinning" With the text changes Reviewed-by: Daniel P. Berrangé <berrange@xxxxxxxxxx> Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|