On Tue, Jun 04, 2013 at 10:34:44AM -0400, Vivek Goyal wrote:
> On Tue, Jun 04, 2013 at 12:15:56PM +0100, Daniel P. Berrange wrote:
> > On Mon, Jun 03, 2013 at 07:13:02PM -0700, Tejun Heo wrote:
> > > Some resources controlled by cgroup aren't per-task and cgroup core
> > > allowing threads of a single thread_group to be in different cgroups
> > > forced memcg to explicitly find the group leader and use it. This is
> > > gonna be nasty when transitioning to unified hierarchy and in general
> > > we don't want and won't support granularity finer than processes.
> >
> > With libvirt and KVM we require the ability to put different threads
> > in different cgroups for the "cpu", "cpuset" & "cpuacct" controllers.
> > This is to allow us to control scheduler tunables / placement for
> > QEMU vCPU threads, independently of limits for QEMU I/O threads. So
> > requiring all threads of a process to be in the same cgroup isn't
> > sufficiently flexible for our needs.
>
> For placement of vCPU threads, can we set per thread cpu affinity
> (sched_setaffinity()), instead of using cgroups for that purpose?

sched_setaffinity can't override affinity already set in the cgroup.
So this won't allow for disjoint affinity sets between threads. ie if
you use cgroups to bind the process to pCPU 1 (so that it applies to
all non-vCPU threads) and then want to bind vCPU threads to pCPU 2,
you can't do it.

eg for cpu/cpuacct/cpuset controllers we have a setup

  <domain cgroup>         0 threads
    |
    +- vcpu0              1 thread
    +- vcpu1              1 thread
    +- emulator           n threads

and want complete independence in settings for each of these child
cgroups.

> Apart from cpu affinity, what scheduling parameters do we want to be
> different between different threads?

Placement isn't the big deal - it is really the cpu.cfs_period_us,
cpu.cfs_quota_us and cpu.shares settings that are the important ones,
along with cpuacct.{stat,usage,usage_percpu} to track utilization
across multiple threads.
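To make the layout above concrete, here is a minimal sketch in Python.
It is only an illustration under stated assumptions, not libvirt's
actual code: the function names, the "guest1" domain name, the thread
IDs and the limit values are all hypothetical. It assumes a cgroup v1
style hierarchy, where writing a thread ID to a group's "tasks" file
moves just that thread, and it takes the controller root as a parameter
so it can be pointed at a scratch directory instead of a real mount.

```python
import os
import tempfile


def write_value(path, value):
    # One value per write, as cgroup control files expect.
    with open(path, "w") as f:
        f.write(f"{value}\n")


def setup_domain_cgroups(controller_root, domain, groups):
    """Create one child cgroup per entry in `groups` under the domain cgroup.

    `groups` maps a child name to {"tids": [...], "settings": {file: value}}.
    No threads are placed in the domain cgroup itself.
    """
    domain_dir = os.path.join(controller_root, domain)
    for name, spec in groups.items():
        child = os.path.join(domain_dir, name)
        os.makedirs(child, exist_ok=True)
        for fname, value in spec.get("settings", {}).items():
            write_value(os.path.join(child, fname), value)
        for tid in spec.get("tids", []):
            # Assumption: cgroup v1 semantics, where "tasks" takes a single
            # thread ID per write and moves only that thread.
            write_value(os.path.join(child, "tasks"), tid)
    return domain_dir


if __name__ == "__main__":
    # Hypothetical thread IDs and limits, chosen only for illustration.
    layout = {
        "vcpu0": {"tids": [1001],
                  "settings": {"cpu.cfs_period_us": 100000,
                               "cpu.cfs_quota_us": 50000}},
        "vcpu1": {"tids": [1002],
                  "settings": {"cpu.cfs_period_us": 100000,
                               "cpu.cfs_quota_us": 50000}},
        "emulator": {"tids": [1003, 1004],
                     "settings": {"cpu.shares": 512}},
    }
    # A scratch directory stands in for the real controller mount point,
    # so this runs without root. In a plain directory each write to
    # "tasks" simply overwrites the previous one.
    with tempfile.TemporaryDirectory() as root:
        domain_dir = setup_domain_cgroups(root, "guest1", layout)
        for name in sorted(os.listdir(domain_dir)):
            print(name, sorted(os.listdir(os.path.join(domain_dir, name))))
```

The point of the sketch is that each child group carries its own
settings and its own set of threads, which is exactly what a
process-granularity restriction would take away.
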

For cpuacct, if we only had 1 cgroup for all threads, we'd have to
read the process's overall usage and then subtract the usage of the
individual threads. This would really be a step backwards, throwing
away the benefits that cgroups brought in allowing arbitrary
groupings of tasks to be set up :-(

Regards,
Daniel
--
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/containers