On Fri, Jul 17, 2015 at 15:40:31 +0100, Daniel Berrange wrote: > On Thu, Jul 16, 2015 at 04:18:08PM +0200, Peter Krempa wrote: > > The scope name, even according to our docs is > > "machine-$DRIVER\x2d$VMNAME.scope" virSystemdMakeScopeName would use the > > resource partition name instead of "machine-" if it was specified thus > > creating invalid scope paths. > > > > This makes libvirt drop cgroups for a VM that uses custom resource > > partition upon reconnecting since the detected scope name would not > > match the expected name generated by virSystemdMakeScopeName. > > > > The error is exposed by the following log entry: > > > > debug : virCgroupValidateMachineGroup:302 : Name 'machine-qemu\x2dtestvm.scope' for controller 'cpu' does not match 'testvm', 'testvm.libvirt-qemu' or 'machine-test-qemu\x2dtestvm.scope' > > > > for a "/machine/test" resource and "testvm" vm. > > This doesn't look right - the whole idea of the resource partition > name is that it overrides the default /machine based path. So > ignoring the resource partition is not right, unless I'm mis- > understanding what this patch is doing. Perhaps I wasn't clear enough in the explanation. The code path that calls the function where I've removed the partition is called on daemon restart, where the paths to cgroups are re-detected upon startup. At first virCgroupNewDetect() is called on a VM's pid and the function detects cgroup paths for the given process. The detected data for a VM with <partition>/machine/asdf/test</partiotion> look like: (gdb) p **group $2 = {path = 0x7fff94000ec0 "", controllers = {{type = 0, mountPoint = 0x7fff94001330 "/sys/fs/cgroup/cpu,cpuacct", linkPoint = 0x7fff94001470 "/sys/fs/cgroup/cpu", placement = 0x7fff94001fb0 "/machine.slice/machine-asdf.slice/machine-asdf-test.slice/machine-qemu\\x2df20live2.scope/emulator"}, {type = 0, mountPoint = 0x7fff94001200 "/sys/fs/cgroup/cpu,cpuacct", linkPoint = 0x7fff94001450 "/sys/fs/cgroup/cpuacct", placement = 0x7fff94002020 "/machine.slice/machine-asdf.slice/machine-asdf-test.slice/machine-qemu\\x2df20live2.scope/emulator"}, {type = 0, mountPoint = 0x7fff94001410 "/sys/fs/cgroup/cpuset", linkPoint = 0x0, placement = 0x7fff94001ad0 "/machine.slice/machine-asdf.slice/machine-asdf-test.slice/machine-qemu\\x2df20live2.scope/emulator"}, ... and so on for other cgroups. After the detection virCgroupValidateMachineGroup() is called on the detected cgroup array to validate them with the following algorithm: FOREACH entry IN 'cgroup' array: Find the last directory name in entry.placement (strrchr(..., '/')). IF the cgroup is one of CPU* cgroups AND the last directory component is '/emulator': find next previous directory component and strip '/emuator' from it # This equals to 'machine-qemu\\x2df20live2.scope' in the example above. # validate the name IF directory name IS EQUAL TO 'MACHINE_NAME' OR 'MACHINE_NAME.libvirt-DRIVER_NAME' OR virSystemdMakeScopeName(...): cgroup is valid Since the algorithm does not extract the partition name or validate it in any way in non-systemd scenarios and in scenarios where systemd is used the generated name wouldn't ever match since the scope name as extracted in the code does not include the slice generated from <partition> the cgroups code does not work on systemd machines after libvirt restart. The patch as-is removes the partition name from the code that generates the systemd SCOPE name since that does not contain it. The partition is used to generate SLICE name which is contained in the cgroup path before the SCOPE. The question that comes into my mind is whether we should actually make sure that the <partition> corresponds to the SLICE in the detected path, or just matching the machine SCOPE is good enough in this case. Peter
Attachment:
signature.asc
Description: Digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list