Re: [PATCH cgroup/for-3.11 1/3] cgroup: mark "tasks" cgroup file as insane

On Thu, Jun 06, 2013 at 02:14:10PM -0700, Tejun Heo wrote:
> Hello,
> 
> On Thu, Jun 06, 2013 at 10:20:55AM +0100, Daniel P. Berrange wrote:
> > Unless I'm mistaken there is no alternative that can work. With QEMU
> > we need to apply scheduling controls to 
> > 
> >   1. Individual vCPU threads
> >   2. All non-vCPU threads (ie QEMU's I/O threads)
> > 
> > We can use per-thread APIs for 1, but for 2 we require something that
> > applies to the group of threads as a whole, without also impacting the
> > controls set for the vCPU threads. AFAIK, nothing except cgroups as
> > we use them today can satisfy that requirement? Am I wrong? Is there
> > something else that can achieve this same setup?
> 
> Can you please explain more about your requirements on !vCPU threads?

Well, we pretty much need the tunables available in the cpu, cpuset
and cpuacct controllers to be available for the set of non-vCPU threads
as a group, e.g. cpu_shares, cfs_period_us, cfs_quota_us, cpuacct.usage,
cpuacct.usage_percpu, cpuset.cpus, cpuset.mems.

CPU/memory affinity could possibly be done with a combination of
sched_setaffinity + libnuma, but I'm not sure that it has quite
the same semantics. IIUC, with the cpuset cgroup, changing affinity
will cause the kernel to migrate existing memory allocations to
the newly specified node masks, but this isn't done if you just
use sched_setaffinity/libnuma.
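
As a rough illustration of the per-thread half of that alternative, here
is a minimal Python sketch that pins one thread with sched_setaffinity.
The helper name pin_thread is hypothetical, and the sketch only covers CPU
affinity; it does nothing about memory placement or migration.

```python
import os

def pin_thread(tid, cpus):
    """Restrict one thread (by TID; 0 means the caller) to the given CPUs.

    Hypothetical helper for illustration. This is CPU affinity only: it
    does not touch memory node masks or move existing allocations.
    """
    os.sched_setaffinity(tid, set(cpus))
    # Read the mask back so the caller can see what the kernel accepted.
    return os.sched_getaffinity(tid)
```

Calling pin_thread(0, {0, 1}) would restrict the calling thread to CPUs 0
and 1, provided both are available to it.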

For cpu accounting, you'd have to look at the overall cgroup usage
and then subtract the usage accounted to each vCPU thread to get
the non-vCPU thread group total. Possible, but slightly tedious and
less accurate, since you will have timing delays getting info
from each thread's /proc files.
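
The subtraction described above could be sketched roughly as follows. This
assumes the per-thread figures come from /proc/<pid>/task/<tid>/stat
(utime and stime, in clock ticks) and the group total from cpuacct.usage
(nanoseconds). The function names and the clk_tck default are hypothetical,
chosen only for this illustration.

```python
def thread_cpu_ticks(stat_line):
    """Return utime + stime (clock ticks) from one /proc/.../stat line."""
    # The comm field is parenthesised and may contain spaces, so split
    # on the last ')' before tokenising the remaining fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]) + int(rest[12])

def non_vcpu_usage_ns(cgroup_usage_ns, vcpu_stat_lines, clk_tck=100):
    """Estimate CPU time (ns) used by the non-vCPU threads.

    cgroup_usage_ns: value read from cpuacct.usage for the whole VM.
    vcpu_stat_lines: one stat line per vCPU thread.
    clk_tck: ticks per second (assumed 100 here; see sysconf SC_CLK_TCK).
    """
    vcpu_ns = sum(
        thread_cpu_ticks(line) * 1_000_000_000 // clk_tck
        for line in vcpu_stat_lines
    )
    # The reads are not atomic, so clamp in case the samples disagree.
    return max(cgroup_usage_ns - vcpu_ns, 0)
```

Because each file is read at a slightly different moment, the result is
only an estimate, which is the inaccuracy mentioned above.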

I don't see any way to do cpu_shares, cfs_period_us and cfs_quota_us
for the group of non-vCPU threads as a whole. You can't set these
at the per-thread level since that is semantically very different.
You can't set these for the process as a whole at the cgroup level,
since that'll confine vCPU threads at the same time which is also
semantically very different.

We need completely separate scheduler tuning for the set of non-vCPU
threads, vs each vCPU thread. AFAICT the only way to do this is with
cgroups, grouping together subsets of threads within the process.
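
A minimal sketch of that grouping, assuming a cgroup v1 hierarchy where
writing a thread ID to a directory's "tasks" file moves just that thread.
The function name, the child directory name and the example path are
hypothetical, not taken from libvirt.

```python
import os

def move_threads(cgroup_dir, child_name, tids):
    """Create a child cgroup under cgroup_dir and move the given TIDs into it.

    Illustrative only: assumes cgroup_dir is the directory the process
    was initially placed in, within one mounted cgroup v1 controller.
    """
    child = os.path.join(cgroup_dir, child_name)
    os.makedirs(child, exist_ok=True)
    tasks = os.path.join(child, "tasks")
    for tid in tids:
        # One TID per write; the file is reopened for each thread.
        with open(tasks, "a") as f:
            f.write(f"{tid}\n")
    return child
```

For example, move_threads("/sys/fs/cgroup/cpu/machine/vm1", "emulator",
io_tids) would place the I/O threads in one leaf, while each vCPU thread
could get its own sibling leaf in the same way.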

> > I understand that having wildly distinct hierarchies across different
> > controllers causes a lot of pain for the kernel. Libvirt doesn't
> > actually require that full level of flexibility though. Our needs
> > are very much simpler. We're happy with the same core hierarchy
> > across all controllers. We just want to be able to create an extra
> > leaf node in some controllers to move threads about. 
> > 
> > It would be fine with us if the kernel required that the same directory
> > hierarchy exists in all controllers, and mandated that threads can only
> > be moved to a directory immediately below where the process is initially
> > placed.
> 
> The problem is that it doesn't make any sense to split threads of the
> same process for at least two major controllers and you end up with
> a situation where you can't identify a resource as belonging to a
> certain cgroup because such a level of granularity is simply undefined.
> As I wrote before, we can special case certain controllers but I'm
> extremely reluctant.  If you need it, please convince me.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/containers