Re: [PATCH v8 3/6] cpuset: Add cpuset.sched.load_balance flag to v2

On 05/28/2018 08:45 AM, Peter Zijlstra wrote:
> On Thu, May 24, 2018 at 02:55:25PM -0400, Waiman Long wrote:
>> On 05/24/2018 11:43 AM, Peter Zijlstra wrote:
>>> I'm confused... why exactly do we have both domain and load_balance ?
>> The domain is for partitioning the CPUs only. It doesn't change the load
>> balancing state. So the load_balance flag is still needed to turn on and
>> off load balancing.
> OK, so we have two boolean flags, giving 4 possible states. Let's just
> go through them one by one:
>
> A) domain:0 load_balance:0 -- we have no exclusive domain, but have
>    load-balancing disabled across them. AFAICT this should be an invalid
>    state.
>
> B) domain:0 load_balance:1 -- we have no exclusive domain, but have
>    load-balancing enabled. AFAICT this is the default state and is a
>    no-op.
>
> C) domain:1 load_balance:0 -- we have an exclusive domain, and have
>    load-balancing disabled across it. This is, AFAICT, identical to
>    having a bunch of sub/sibling groups each with a single CPU domain.
>
> D) domain:1 load_balance:1 -- we have an exclusive domain, and have
>    load-balancing enabled. This is a partition.
>
> Now, I think I've overlooked the fact that load_balance==1 only really
> means something when the parent's load_balance==0, but I'm not sure that
> really changes anything.
>
> So, afaict, the above only have two useful states: B and D. Which again
> raises the question, why two knobs? What useful configurations does it
> allow?

I am working on the v9 patch, and below is the current draft of the
documentation. Hopefully that will clarify some of the concepts that we
are discussing here.

  cpuset.sched.domain_root
        A read-write single value file which exists on non-root
        cpuset-enabled cgroups.  It is a binary value flag that accepts
        either "0" (off) or "1" (on).  This flag is set by the parent
        and is not delegatable.

        If set, it indicates that the current cgroup is the root of a
        new scheduling domain or partition that comprises itself and
        all its descendants except those that are scheduling domain
        roots themselves and their descendants.  The root cgroup is
        always a scheduling domain root.

        There are constraints on where this flag can be set.  It can
        only be set in a cgroup if all the following conditions are true.

        1) "cpuset.cpus" is not empty and the list of CPUs is
           exclusive, i.e. they are not shared by any of its siblings.
        2) The parent cgroup is also a scheduling domain root.
        3) There are no child cgroups with cpuset enabled.  This is
           to eliminate corner cases that would have to be handled if
           such a configuration were allowed.
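
The three conditions can be sketched as a single check. This is a toy
Python model under stated assumptions, not the patch's implementation:
the Cpuset class and can_set_domain_root are hypothetical names, and
every cgroup in the model is assumed to have cpuset enabled.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass(eq=False)
class Cpuset:
    # Hypothetical model of a cpuset-enabled cgroup.
    name: str
    cpus: Set[int] = field(default_factory=set)
    parent: Optional["Cpuset"] = None
    domain_root: bool = False
    children: List["Cpuset"] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.parent is not None:
            self.parent.children.append(self)


def can_set_domain_root(cs: Cpuset) -> bool:
    if cs.parent is None:
        return False  # the flag file only exists on non-root cgroups
    siblings = [s for s in cs.parent.children if s is not cs]
    # 1) CPU list is non-empty and not shared with any sibling.
    exclusive = bool(cs.cpus) and all(
        cs.cpus.isdisjoint(s.cpus) for s in siblings
    )
    # 2) The parent is itself a scheduling domain root.
    parent_ok = cs.parent.domain_root
    # 3) No child cgroups (all assumed cpuset-enabled in this model).
    no_children = not cs.children
    return exclusive and parent_ok and no_children


root = Cpuset("root", {0, 1, 2, 3}, domain_root=True)
a = Cpuset("a", {0, 1}, parent=root)
b = Cpuset("b", {1, 2}, parent=root)
c = Cpuset("c", {3}, parent=root)
a1 = Cpuset("a1", {0}, parent=a)

print(can_set_domain_root(a))   # False: shares CPU 1 with b, has a child
print(can_set_domain_root(c))   # True
print(can_set_domain_root(a1))  # False: parent a is not a domain root
```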

        Setting this flag will take the CPUs away from the effective
        CPUs of the parent cgroup.  Once it is set, this flag cannot
        be cleared if there are any child cgroups with cpuset enabled.
        Further changes made to "cpuset.cpus" are allowed as long as
        the first condition above is still true.

        A parent scheduling domain root cgroup cannot distribute all
        its CPUs to its child scheduling domain root cgroups unless
        its load balancing flag is turned off.
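
As a rough illustration of the two statements above, the following
Python sketch computes what is left of a parent's effective CPUs once
child domain roots have taken theirs, and checks the "cannot distribute
all CPUs" rule. Both function names are hypothetical, and the model
ignores everything except the CPU sets and the parent's flag.

```python
from typing import Iterable, List, Set


def parent_effective_cpus(
    parent_cpus: Iterable[int], child_root_cpus: List[Set[int]]
) -> Set[int]:
    # CPUs owned by child scheduling domain roots are taken away
    # from the parent's effective CPUs.
    taken = set().union(*child_root_cpus)
    return set(parent_cpus) - taken


def distribution_allowed(
    parent_cpus: Iterable[int],
    child_root_cpus: List[Set[int]],
    parent_load_balance: bool,
) -> bool:
    # Giving away every CPU is only acceptable when the parent's
    # load balancing flag is off.
    remaining = parent_effective_cpus(parent_cpus, child_root_cpus)
    return bool(remaining) or not parent_load_balance


print(sorted(parent_effective_cpus({0, 1, 2, 3}, [{0, 1}])))      # [2, 3]
print(distribution_allowed({0, 1, 2, 3}, [{0, 1}, {2, 3}], True))   # False
print(distribution_allowed({0, 1, 2, 3}, [{0, 1}, {2, 3}], False))  # True
```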

  cpuset.sched.load_balance
        A read-write single value file which exists on non-root
        cpuset-enabled cgroups.  It is a binary value flag that accepts
        either "0" (off) or "1" (on).  This flag is set by the parent
        and is not delegatable.  It is on by default in the root cgroup.

        When it is on, tasks within this cpuset will be load-balanced
        by the kernel scheduler.  Tasks will be moved from CPUs with
        high load to other CPUs within the same cpuset with less load
        periodically.

        When it is off, there will be no load balancing among CPUs on
        this cgroup.  Tasks will stay in the CPUs they are running on
        and will not be moved to other CPUs.

        The load balancing state of a cgroup can only be changed on a
        scheduling domain root cgroup with no cpuset-enabled children.
        All cgroups within a scheduling domain or partition must have
        the same load balancing state.  As descendant cgroups of a
        scheduling domain root are created, they inherit the same load
        balancing state of their root.
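
The change and inheritance rules in the last paragraph can be modeled
as follows. This is only a sketch: Cpuset, new_child and
set_load_balance are hypothetical names, and raising ValueError is an
assumption standing in for whatever error the kernel would return.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Cpuset:
    # Hypothetical model; every cgroup is assumed cpuset-enabled.
    name: str
    parent: Optional["Cpuset"] = None
    domain_root: bool = False
    load_balance: bool = True
    children: List["Cpuset"] = field(default_factory=list)


def new_child(parent: Cpuset, name: str) -> Cpuset:
    # A new descendant starts with its partition's load balancing state.
    # The parent is in that partition, so its state equals the root's.
    child = Cpuset(name, parent=parent, load_balance=parent.load_balance)
    parent.children.append(child)
    return child


def set_load_balance(cs: Cpuset, value: bool) -> None:
    # Only a scheduling domain root without children may change state.
    if not cs.domain_root:
        raise ValueError("not a scheduling domain root")
    if cs.children:
        raise ValueError("has cpuset-enabled children")
    cs.load_balance = value


root = Cpuset("root", domain_root=True)
part = new_child(root, "part")
part.domain_root = True          # assume the domain_root conditions hold
set_load_balance(part, False)    # allowed: domain root, no children yet
leaf = new_child(part, "leaf")

print(leaf.load_balance)  # False, inherited from the partition
```

Once "leaf" exists, a further set_load_balance(part, True) raises in
this model, which keeps the whole partition in a single state.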

The main purpose of the new domain_root flag is to enable users to
create new partitions without the trick of disabling load_balance in the
parent and enabling it in the child. Now we can create as many
partitions as we want without ever turning off load balancing in any of
the cpusets. I find it more straightforward and easier to
understand than the load_balance trick.

Of course, turning off load balancing is still useful in some use cases,
so it is supported. To simplify things, it is mandated that all the
cpusets within a partition must have the same load balancing state. This
ensures that we can't use the load_balance trick to create
additional partitions underneath it. The domain_root flag is the only way
to create a partition.

A) domain_root: 0, load_balance: 0 -- a non-domain root cpuset within a
no load balancing partition.

B) domain_root: 0, load_balance: 1 -- a non-domain root cpuset within a
load balancing partition.

C) domain_root: 1, load_balance: 0 -- a domain root cpuset of a no load
balancing partition.

D) domain_root: 1, load_balance: 1 -- a domain root cpuset of a load
balancing partition.
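
The same four cases, written as a small Python lookup. The describe
function is a hypothetical helper that just restates the list above;
it is not part of the patch.

```python
def describe(domain_root: bool, load_balance: bool) -> str:
    # domain_root picks the cpuset's role, load_balance the kind of
    # partition it sits in.
    role = (
        "a domain root cpuset of"
        if domain_root
        else "a non-domain root cpuset within"
    )
    kind = (
        "a load balancing partition"
        if load_balance
        else "a no load balancing partition"
    )
    return f"{role} {kind}"


for dr in (False, True):
    for lb in (False, True):
        print(f"domain_root: {int(dr)}, load_balance: {int(lb)} -- "
              f"{describe(dr, lb)}")
```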

Hope this helps.

Cheers,
Longman


