On Mon, Jul 20, 2015 at 11:10:18AM +0200, Peter Krempa wrote: > On Fri, Jul 17, 2015 at 15:40:31 +0100, Daniel Berrange wrote: > > On Thu, Jul 16, 2015 at 04:18:08PM +0200, Peter Krempa wrote: > > > The scope name, even according to our docs is > > > "machine-$DRIVER\x2d$VMNAME.scope" virSystemdMakeScopeName would use the > > > resource partition name instead of "machine-" if it was specified thus > > > creating invalid scope paths. > > > > > > This makes libvirt drop cgroups for a VM that uses custom resource > > > partition upon reconnecting since the detected scope name would not > > > match the expected name generated by virSystemdMakeScopeName. > > > > > > The error is exposed by the following log entry: > > > > > > debug : virCgroupValidateMachineGroup:302 : Name 'machine-qemu\x2dtestvm.scope' for controller 'cpu' does not match 'testvm', 'testvm.libvirt-qemu' or 'machine-test-qemu\x2dtestvm.scope' > > > > > > for a "/machine/test" resource and "testvm" vm. > > > > This doesn't look right - the whole idea of the resource partition > > name is that it overrides the default /machine based path. So > > ignoring the resource partition is not right, unless I'm mis- > > understanding what this patch is doing. > > Perhaps I wasn't clear enough in the explanation. The code path that > calls the function where I've removed the partition is called on daemon > restart, where the paths to cgroups are re-detected upon startup. > > At first virCgroupNewDetect() is called on a VM's pid and the function > detects cgroup paths for the given process. > > The detected data for a VM with > <partition>/machine/asdf/test</partiotion> look like: > (gdb) p **group > $2 = {path = 0x7fff94000ec0 "", > controllers = {{type = 0, > mountPoint = 0x7fff94001330 "/sys/fs/cgroup/cpu,cpuacct", > linkPoint = 0x7fff94001470 "/sys/fs/cgroup/cpu", > placement = 0x7fff94001fb0 "/machine.slice/machine-asdf.slice/machine-asdf-test.slice/machine-qemu\\x2df20live2.scope/emulator"}, > {type = 0, > mountPoint = 0x7fff94001200 "/sys/fs/cgroup/cpu,cpuacct", > linkPoint = 0x7fff94001450 "/sys/fs/cgroup/cpuacct", > placement = 0x7fff94002020 "/machine.slice/machine-asdf.slice/machine-asdf-test.slice/machine-qemu\\x2df20live2.scope/emulator"}, > {type = 0, > mountPoint = 0x7fff94001410 "/sys/fs/cgroup/cpuset", > linkPoint = 0x0, > placement = 0x7fff94001ad0 "/machine.slice/machine-asdf.slice/machine-asdf-test.slice/machine-qemu\\x2df20live2.scope/emulator"}, > > ... and so on for other cgroups. > > After the detection virCgroupValidateMachineGroup() is called on the > detected cgroup array to validate them with the following algorithm: > > FOREACH entry IN 'cgroup' array: > Find the last directory name in entry.placement (strrchr(..., '/')). > > IF the cgroup is one of CPU* cgroups AND > the last directory component is '/emulator': > > find next previous directory component and strip '/emuator' from it > > > # This equals to 'machine-qemu\\x2df20live2.scope' in the example above. > > # validate the name > > IF directory name IS EQUAL TO 'MACHINE_NAME' OR > 'MACHINE_NAME.libvirt-DRIVER_NAME' OR > virSystemdMakeScopeName(...): > cgroup is valid > > > Since the algorithm does not extract the partition name or validate it > in any way in non-systemd scenarios and in scenarios where systemd is > used the generated name wouldn't ever match since the scope name as > extracted in the code does not include the slice generated from > <partition> the cgroups code does not work on systemd machines after > libvirt restart. > > The patch as-is removes the partition name from the code that generates > the systemd SCOPE name since that does not contain it. The partition is > used to generate SLICE name which is contained in the cgroup path before > the SCOPE. Ok, I understand what's going on now and reproduced the problem locally too. So ACK to your proposed patch. > The question that comes into my mind is whether we should actually make > sure that the <partition> corresponds to the SLICE in the detected path, > or just matching the machine SCOPE is good enough in this case. I'm not sure it matters much - as long as we've detected the cgroup tree we'll do the right thing cleaning it up on guest shutdown, even if the <partition> in XML is different. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list