On Mon, Feb 05, 2018 at 10:40:58 +0100, Martin Kletzander wrote: > On Fri, Feb 02, 2018 at 06:05:56PM +0100, Peter Krempa wrote: > > On Fri, Feb 02, 2018 at 15:27:22 +0100, Martin Kletzander wrote: > > > Since one of the things in capabilities (info from resctrl updated with data > > > about caches) can be change on the system by remounting the /sys/fs/resctrl with > > > different options, the capabilities need to be refreshed. There is a better fix > > > in the works, but it's going to be way bigger than this (hence the XXX note > > > there), so for the time being let's workaround this. And in order not to slow > > > down the domain starting, only get the capabilities if there are any cachetunes. > > > > > > Relates-to: https://bugzilla.redhat.com/show_bug.cgi?id=1540780 > > > > > > Signed-off-by: Martin Kletzander <mkletzan@xxxxxxxxxx> > > > --- > > > src/qemu/qemu_process.c | 8 +++++++- > > > 1 file changed, 7 insertions(+), 1 deletion(-) > > > > > > diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c > > > index 5a364730c8c1..2ab242b7634d 100644 > > > --- a/src/qemu/qemu_process.c > > > +++ b/src/qemu/qemu_process.c > > > @@ -2514,9 +2514,15 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver, > > > { > > > int ret = -1; > > > size_t i = 0; > > > - virCapsPtr caps = virQEMUDriverGetCapabilities(driver, false); > > > + virCapsPtr caps = NULL; > > > qemuDomainObjPrivatePtr priv = vm->privateData; > > > > > > + if (!vm->def->ncachetunes) > > > + return 0; > > > + > > > + /* Force capability refresh since resctrl info can change > > > + * XXX: move cache info into virresctrl so caps are not needed */ > > > + caps = virQEMUDriverGetCapabilities(driver, true); > > > > I hope that your upcoming fix will also deal with the hugepage and NUMa > > data which is more often used than this. > > > > Actually, it will just deal with resctrl info being as consistent as > possible. I mean you can always remount the directory between libvirtd > reading the two files, there will always be a window of opportunity. > The fix I have in mind (and code) is just going to make rid of the > capability usage from the driver in resctrl-related code. Do you have > any pointers to the numa and hugepage faults that we ought to fix? For the hugepage code I don't really know the internals which are wrong, but the high-level symptom is similar/same to what you have with resctl. If you mount your hugepage mounts (or modify them) after starting libvirtd, your hugepage-backed guests will fail to start until libvirtd is restarted. In the NUMA/capability code the bug is that if you enable/disable any host cpus, we use stale data when translating a numa node map into a full cpumap in virCapabilitiesGetCpusForNode. Cgroups code is not happy when attempting to pin a thread to a non-existing cpu.] > Maybe it can be dealt with the same way. Possibly. It probably isn't very expensive to iterate the numa node maps. I'm not sure about the hugepage mounts, but in that case the guest is probably going to be huge, so some overhead might be okay. > > > My feelings towards doing this are slightly mixed, but ACK > > Thanks, pinky swear that I'll remove it in near future ;)
Attachment:
signature.asc
Description: PGP signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list