On Fri, 21.10.16 11:19, Daniel P. Berrange (berrange@xxxxxxxxxx) wrote:

> On Thu, Oct 20, 2016 at 02:59:45PM -0400, Tejun Heo wrote:
> > (reposting w/ libvir-list cc'd, sorry about the delay in reposting,
> > was traveling and then on vacation)
> >
> > Hello, Daniel. How have you been?
> >
> > We (facebook) are deploying cgroup v2 and internally use libvirt to
> > manage virtual machines, so I'm trying to add cgroup v2 support to
> > libvirt.
> >
> > Because cgroup v2's resource configurations differ from v1 in varying
> > degrees depending on the specific resource type, it unfortunately
> > introduces new configurations (some completely new configs, others
> > just a different range / format). This means that adding cgroup v2
> > support to libvirt requires adding new config options to it and maybe
> > implementing some form of translation mechanism between overlapping
> > configs.
> >
> > The upcoming systemd release includes all that's necessary to support
> > v1/v2 compatibility so that users setting resource configs through
> > systemd don't have to worry about whether v1 or v2 is in use. I'm
> > wondering whether it would make sense to make libvirt use dbus calls
> > to systemd to set resource configs when systemd is in use, so that it
> > can piggyback on systemd's v1/v2 compatibility.
>
> The big question I have around cgroup v2 is state of support for all
> controllers that libvirt uses (cpu, cpuacct, cpuset, memory, devices,
> freezer, blkio). IIUC, not all of these have been ported to cgroup
> v2 setup and the cpu port in particular was rejected by Linux maintainers.
> Libvirt has a general policy that we won't support features that only
> exist in out of tree patches (applies to kernel and any other software
> we build against or use).
>
> IIRC from earlier discussions, the model for dealing with processes in
> cgroup v2 was quite different.
> In libvirt we rely on the ability to
> assign different threads within a process to different cgroups, because
> we need to control CPU scheduler parameters on different threads in
> QEMU. E.g. we have vCPU threads, I/O threads and general emulator threads
> each of which get different policies.
>
> When I spoke with Lennart about cgroup v2, way back in Jan, he indicated
> that while systemd can technically work with a system where some
> controllers are mounted as v1, while others are mounted as v2, this
> would not be an officially supported solution. Thus systemd in Fedora
> was not likely to switch to v2 until all required controllers could use
> v2. I'm not sure if this still corresponds to Lennart's current views, so
> CC'ing him to confirm/deny.

So, the "hybrid" mode is probably nothing RHEL or so would want to
support. However, I think it might be a good step for Fedora at
least. But yes, supporting this mode means additional porting effort
for the various daemons that access cgroupfs...

> I recall that systemd policy for v2 was intended to be that no app
> should write to cgroup sysfs except for systemd, unless there was
> a sub-tree created with Delegate=yes set on the scope. So this clearly
> means when using v2 we'll have to use the systemd DBus APIs for managing
> cgroups v2 on such hosts.

Yes, this is our policy: the cgroup tree is private property of
systemd (at least regarding write access), except when you have a
service or scope unit where Delegate=yes is set, in which case you can
manage your own subtree of that freely.

Lennart

--
Lennart Poettering, Red Hat

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list
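
For daemons that would need porting, the distinction drawn above between a pure v1 setup, the "hybrid" mode, and a pure v2 setup could be detected roughly as follows. This is only a sketch, not something taken from libvirt or systemd: the function name `classify_cgroup_setup` and the labels "legacy" and "unified" are assumptions made for illustration ("hybrid" is the term used in the reply above). It relies on the fact that `/proc/self/mountinfo` reports v1 hierarchies with filesystem type `cgroup` and the v2 hierarchy with type `cgroup2`.

```python
def classify_cgroup_setup(mountinfo_text):
    """Classify cgroup mounts found in the text of /proc/self/mountinfo.

    Returns "legacy" (only v1 hierarchies), "unified" (only v2),
    "hybrid" (both present), or "none". The labels are illustrative.
    """
    has_v1 = False
    has_v2 = False
    for line in mountinfo_text.splitlines():
        # In mountinfo, a lone "-" field separates the per-mount fields
        # from "fstype source super-options".
        _, sep, tail = line.partition(" - ")
        if not sep:
            continue
        fields = tail.split()
        if not fields:
            continue
        fstype = fields[0]
        if fstype == "cgroup":
            # Any v1 hierarchy counts, including named ones without controllers.
            has_v1 = True
        elif fstype == "cgroup2":
            has_v2 = True
    if has_v1 and has_v2:
        return "hybrid"
    if has_v2:
        return "unified"
    if has_v1:
        return "legacy"
    return "none"


if __name__ == "__main__":
    with open("/proc/self/mountinfo") as f:
        print(classify_cgroup_setup(f.read()))
```

Even on a host classified as "unified" or "hybrid", the policy stated above still applies: a daemon should only write below a subtree of a service or scope unit that has Delegate=yes set.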