On Thu, Jul 31, 2014 at 01:13:19PM +0200, Ján Tomko wrote:
> Hello developers!
>
> Currently, our default cgroup layout is:
> -top level cgroup
>  \-machine (machine.slice with systemd)
>    `-vm1.libvirt-qemu (machine-qemu\x2dvm1.scope with systemd)
>      `-emulator
>      `-vcpu0
>      \-vcpu1
>    \-vm2.libvirt-qemu
>      `-emulator
>      `-vcpu0
>      `-vcpu1
>
> To free some CPUs for exclusive use, either all processes from the top level
> cgroup should be moved to another one (which does not seem like a great idea)
> or isolcpus= should be specified on the kernel command line.

IIUC, when you say 'exclusive use' here you are basically aiming to
strictly separate all QEMU processes from all general OS processes.

So, yes, in this case isolcpus is a fairly natural way to achieve this.
On a 4 NUMA node system with 4 CPUs in each node, you might set
isolcpus=4-15, which keeps the general scheduler off those CPUs, so the
OS is confined to the first NUMA node (CPUs 0-3). You'd then have CPUs
4-15 (in NUMA nodes 1-3) for use by VMs.

> The cpuset.cpu_exclusive option can be set on a cgroup if
> * all the groups up to the top level group have it set
> * the cpuset of the current group is a subset of the parent group
>   and no siblings use any cpus from the current cpuset
>
> This would mean that to keep the existing nested structure, all vcpus and the
> emulator thread would need to have an exclusive CPU, e.g:
> <vcpu placement='static' cpuset='4-6'>2</vcpu>
> <cputune exclusive='yes'>
>   <vcpupin vcpu='0' cpuset='5'/>
>   <vcpupin vcpu='1' cpuset='6'/>
>   <emulatorpin cpuset='4'/>
> </cputune>
>
> (The only two issues I found:
> 1) libvirt would have to mess with systemd's 'machine-scope' behind it's back
> (setting cpu_exclusive)

Bear in mind that the end goal with cgroups is that libvirt will not
touch the cgroup filesystem at all. The intent is that we will use
DBus APIs from systemd for setting anything cgroups related. So I think
we'd need to determine what the systemd maintainers' thoughts are wrt
cpuset cpu_exclusive before going down this route.

> 2) creating machines without explicit cpu pinning fails, as libvirt tries to
> write all the cpus to the cpuset, even those the other machine uses
> exclusively)

To me, not specifying any CPU pinning in the XML implies that libvirt
will use the "default placement" of the OS. This need not mean "all
CPUs". So if the cpuset configured on machine.slice has restricted which
CPUs are available to VMs, libvirt should be taking care to honour that.
IOW, we should not blindly write 1s to all CPUs - we should probably read
the available CPU set from the cgroup that we are going to place the VM
under to determine what's available.

> I've also thought about just keeping track of the 'exclusived' CPUs in
> libvirt. This would not work across drivers. And it could possibly be needed
> to solve issue 2).
>
> Do you think any of these options would be useful?
>
> Bug: https://bugzilla.redhat.com/show_bug.cgi?id=996758

Broadly speaking I believe that the job of isolating the host OS
processes onto a subset of CPUs, separate from those available to VMs,
is for the admin to do and out of scope for libvirt.

So I think that libvirt needs to be capable of working with both
approaches you mention above:

 1. kernel booted with isolcpus.

     - Nothing in XML => VMs will only run on CPUs listed in isolcpus
     - Affinity in XML => VMs will be moved into the listed CPUs
                          (which can be different from those in isolcpus)

 2. machine.slice given a restricted cpuset.cpus (regardless of
    whether cpuset.cpu_exclusive is 0 or 1)

     - Nothing in XML => VMs must honour the cpuset.cpus in machine.slice
     - Affinity in XML => VMs will be moved into listed CPUs
                          (which must be a subset of cpuset.cpus)

I'd guess this all broadly works already, with the exception of the bug
we talk about above, where libvirt tries to pin the VM to all CPUs if
none are listed, instead of honouring cpuset.cpus in the cgroup used.

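To make the "nothing in XML" vs "affinity in XML" cases a bit more
concrete, here is a rough Python sketch of the selection logic. This is
not libvirt code: the function names (parse_cpu_list, read_allowed_cpus,
effective_placement) are made up for illustration, and it assumes
cpuset.cpus uses the usual kernel list format such as "0-3,8".

```python
def parse_cpu_list(text):
    """Parse a kernel-style CPU list such as "0-3,8,10-11" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty file or stray comma
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def read_allowed_cpus(cpuset_file):
    """Read the CPUs permitted by a cgroup's cpuset.cpus file.

    The caller supplies the path; where the cpuset controller is
    mounted is an assumption about the host and is not decided here.
    """
    with open(cpuset_file) as f:
        return parse_cpu_list(f.read())


def effective_placement(allowed, xml_affinity=None, must_be_subset=True):
    """Pick the CPUs a VM should be placed on.

    allowed        -- CPUs the parent cgroup permits
    xml_affinity   -- CPUs requested in the domain XML, or None if unset
    must_be_subset -- True for the restricted machine.slice case, where
                      the request has to fit inside cpuset.cpus
    """
    if xml_affinity is None:
        # No pinning in the XML: honour what the cgroup offers instead
        # of asking for every CPU on the host.
        return set(allowed)
    if must_be_subset and not xml_affinity <= allowed:
        extra = sorted(xml_affinity - allowed)
        raise ValueError("requested CPUs %s are outside the allowed set" % extra)
    return set(xml_affinity)
```

With machine.slice restricted to CPUs 4-15,
effective_placement(parse_cpu_list("4-15")) gives CPUs 4-15 rather than
every host CPU, and a request for CPU 3 is rejected. For the isolcpus
case, where the XML affinity may differ from the isolated set, the same
function would be called with must_be_subset=False.
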
Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|