Re: [RFC PATCH 0/5] cgroup/cpuset: A new "isolcpus" partition

Hi,

The following is the proposed text for "cpuset.cpus.reserve" and "cpuset.cpus.partition" of the new cpuset partition in Documentation/admin-guide/cgroup-v2.rst.

  cpuset.cpus.reserve
    A read-write multiple values file which exists only on root
    cgroup.

    It lists all the CPUs that are reserved for adjacent and remote
    partitions created in the system.  See the next section for
    more information on what adjacent and remote partitions are.

    Creation of an adjacent partition does not require touching this
    control file as CPU reservation will be done automatically.
    In order to create a remote partition, the CPUs needed by the
    remote partition have to be written to this file first.

    A "+" prefix can be used to indicate a list of additional
    CPUs that are to be added without disturbing the CPUs that are
    originally there.  For example, if its current value is "3-4",
    echoing "+5" to it will change it to "3-5".

    Once a remote partition is destroyed, its CPUs have to be
    removed from this file or no other process can use them.  A "-"
    prefix can be used to remove a list of CPUs from it.  However,
    removing CPUs that are currently used in existing partitions
    may cause those partitions to become invalid.  A single "-"
    character without any number can be used to indicate removal
    of all the free CPUs not allocated to any partitions to avoid
    accidental partition invalidation.
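
The prefix rules above can be modelled in user space. The following is only a sketch of the described semantics, not the kernel implementation. The function names and the `in_use` parameter (CPUs currently allocated to partitions) are hypothetical, and treating an unprefixed write as a full replacement is an assumption.

```python
def parse_cpulist(text):
    """Parse a CPU list such as "0-2,5" into a set of ints."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def format_cpulist(cpus):
    """Format a set of ints back into a compact list such as "0-2,5"."""
    ordered = sorted(cpus)
    ranges = []
    i = 0
    while i < len(ordered):
        j = i
        # Extend the run while the next CPU number is consecutive.
        while j + 1 < len(ordered) and ordered[j + 1] == ordered[j] + 1:
            j += 1
        if i == j:
            ranges.append(str(ordered[i]))
        else:
            ranges.append(f"{ordered[i]}-{ordered[j]}")
        i = j + 1
    return ",".join(ranges)


def apply_reserve_write(current, written, in_use=""):
    """Model the result of writing `written` to "cpuset.cpus.reserve"."""
    cur = parse_cpulist(current)
    written = written.strip()
    if written == "-":
        # Bare "-": drop only the free CPUs, keep those used by partitions.
        new = cur & parse_cpulist(in_use)
    elif written.startswith("+"):
        new = cur | parse_cpulist(written[1:])
    elif written.startswith("-"):
        new = cur - parse_cpulist(written[1:])
    else:
        # Assumption: an unprefixed write replaces the whole list.
        new = parse_cpulist(written)
    return format_cpulist(new)


print(apply_reserve_write("3-4", "+5"))
```

The final call reproduces the "3-4" plus "+5" example from the text.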

  cpuset.cpus.partition
    A read-write single value file which exists on non-root
    cpuset-enabled cgroups.  This flag is owned by the parent cgroup
    and is not delegatable.

    It accepts only the following input values when written to.

      ==========    =====================================
      "member"      Non-root member of a partition
      "root"        Partition root
      "isolated"    Partition root without load balancing
      ==========    =====================================

    A cpuset partition is a collection of cgroups with a partition
    root at the top of the hierarchy and its descendants except
    those that are separate partition roots themselves and their
    descendants.  A partition has exclusive access to the set of
    CPUs allocated to it.  Other cgroups outside of that partition
    cannot use any CPUs in that set.

    There are two types of partitions - adjacent and remote.  The
    parent of an adjacent partition must be a valid partition root.
    Partition roots of adjacent partitions are all clustered around
    the root cgroup.  Creation of an adjacent partition is done by
    writing the desired partition type into "cpuset.cpus.partition".

    A remote partition does not require a partition root parent.
    So a remote partition can be formed far from the root cgroup.
    However, its creation is a 2-step process.  The CPUs needed
    by a remote partition ("cpuset.cpus" of the partition root)
    have to be written into "cpuset.cpus.reserve" of the root
    cgroup first.  After that, "isolated" can be written into
    "cpuset.cpus.partition" of the partition root to form a remote
    isolated partition which is the only supported remote partition
    type for now.
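
The 2-step creation of a remote partition might be scripted roughly as follows. This is a sketch under assumptions: "cpuset.cpus.reserve" is the interface proposed in this RFC, the helper name `create_remote_isolated_partition` and the example paths are hypothetical, and the use of the "+" prefix to avoid disturbing existing reservations is a choice made here. The helper does not set up "cpuset.cpus" in the ancestors.

```python
import os


def create_remote_isolated_partition(cgroup_root, child):
    """Reserve the child's CPUs at the root, then make it "isolated".

    `cgroup_root` is the cgroup v2 mount point and `child` is the path
    of the future partition root relative to it (both hypothetical).
    Returns the state read back from "cpuset.cpus.partition".
    """
    child_dir = os.path.join(cgroup_root, child)

    # The CPUs needed are the "cpuset.cpus" of the future partition root.
    with open(os.path.join(child_dir, "cpuset.cpus")) as f:
        cpus = f.read().strip()

    # Step 1: add those CPUs to the root's reserve list.
    with open(os.path.join(cgroup_root, "cpuset.cpus.reserve"), "w") as f:
        f.write("+" + cpus)

    # Step 2: turn the cgroup into a remote isolated partition root.
    partition_file = os.path.join(child_dir, "cpuset.cpus.partition")
    with open(partition_file, "w") as f:
        f.write("isolated")

    # On a real system this may read "isolated invalid (<reason>)".
    with open(partition_file) as f:
        return f.read().strip()


# Example call (needs root and a kernel carrying this proposal):
# create_remote_isolated_partition("/sys/fs/cgroup", "app/rt")
```

Reading the state back matters because a write of "isolated" can succeed while the partition still ends up invalid.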

    All remote partitions are terminal as adjacent partitions cannot
    be created underneath them.

    The root cgroup is always a partition root and its state cannot
    be changed.  All other non-root cgroups start out as "member".

    When set to "root", the current cgroup is the root of a new
    partition or scheduling domain.

    When set to "isolated", the CPUs in that partition will
    be in an isolated state without any load balancing from the
    scheduler.  Tasks placed in such a partition with multiple
    CPUs should be carefully distributed and bound to each of the
    individual CPUs for optimal performance.

    The value shown in "cpuset.cpus.effective" of a partition root is
    the CPUs that are dedicated to that partition and not available
    to cgroups outside of that partition.

    A partition root ("root" or "isolated") can be in one of the
    two possible states - valid or invalid.  An invalid partition
    root is in a degraded state where some state information may
    be retained, but behaves more like a "member".

    All possible state transitions among "member", "root" and
    "isolated" are allowed.

    On read, the "cpuset.cpus.partition" file can show the following
    values.

      ============================= =====================================
      "member"                      Non-root member of a partition
      "root"                        Partition root
      "isolated"                    Partition root without load balancing
      "root invalid (<reason>)"     Invalid partition root
      "isolated invalid (<reason>)" Invalid isolated partition root
      ============================= =====================================

    In the case of an invalid partition root, a descriptive string on
    why the partition is invalid is included within parentheses.
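
A monitoring agent could split the value read from the file into its parts. The sketch below follows the read formats listed in the table above. The function name and the returned dictionary layout are hypothetical, and the reason string used in the example is a placeholder, not an actual kernel message.

```python
import re

# "member", "root", "isolated", optionally followed by " invalid (<reason>)".
_STATE_RE = re.compile(r"^(member|root|isolated)(?: invalid \((.*)\))?$")


def parse_partition_state(text):
    """Split a "cpuset.cpus.partition" value into type, validity, reason."""
    match = _STATE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized partition state: {text!r}")
    kind, reason = match.group(1), match.group(2)
    if kind == "member" and reason is not None:
        # The table lists no invalid form for "member".
        raise ValueError(f"unrecognized partition state: {text!r}")
    return {"type": kind, "valid": reason is None, "reason": reason}


print(parse_partition_state("isolated invalid (placeholder reason)\n"))
```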

    For an adjacent partition root to be valid, the following
    conditions must be met.

    1) The "cpuset.cpus" is exclusive with its siblings, i.e. its CPUs
       are not shared by any of its siblings (exclusivity rule).
    2) The parent cgroup is a valid partition root.
    3) The "cpuset.cpus" is not empty and must contain at least
       one of the CPUs from parent's "cpuset.cpus", i.e. they overlap.
    4) The "cpuset.cpus.effective" cannot be empty unless there is
       no task associated with this partition.

    For a remote partition root to be valid, the following conditions
    must be met.

    1) The same exclusivity rule as for an adjacent partition root.
    2) The "cpuset.cpus" is not empty and all the CPUs must be
       present in "cpuset.cpus.reserve" of the root cgroup and none
       of them are allocated to another partition.
    3) The "cpuset.cpus" value must be present in all its ancestors
       to ensure proper hierarchical cpu distribution.
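
The two lists of conditions can be restated as set checks. The following sketch only mirrors the wording above and is not how the kernel evaluates validity. All function and parameter names are hypothetical, and CPU sets are passed as Python sets of CPU numbers. Each function returns the numbers of the violated conditions.

```python
def adjacent_root_violations(cpus, sibling_cpus, parent_is_valid_root,
                             parent_cpus, effective_cpus, has_tasks):
    """Return the numbers of the adjacent-partition conditions not met."""
    violated = []
    if any(cpus & other for other in sibling_cpus):
        violated.append(1)  # shares CPUs with a sibling
    if not parent_is_valid_root:
        violated.append(2)  # parent is not a valid partition root
    if not cpus or not (cpus & parent_cpus):
        violated.append(3)  # empty, or no overlap with the parent
    if not effective_cpus and has_tasks:
        violated.append(4)  # tasks present but no effective CPUs
    return violated


def remote_root_violations(cpus, sibling_cpus, reserved_cpus,
                           cpus_in_other_partitions, ancestor_cpus):
    """Return the numbers of the remote-partition conditions not met."""
    violated = []
    if any(cpus & other for other in sibling_cpus):
        violated.append(1)  # shares CPUs with a sibling
    if (not cpus or not cpus <= reserved_cpus
            or cpus & cpus_in_other_partitions):
        violated.append(2)  # empty, unreserved, or already allocated
    if any(not cpus <= ancestor for ancestor in ancestor_cpus):
        violated.append(3)  # missing from some ancestor's "cpuset.cpus"
    return violated


# A remote partition on CPUs 2-3 when only CPU 2 has been reserved:
print(remote_root_violations({2, 3}, [{4}], {2}, set(), [{0, 1, 2, 3}]))
```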

    External events like hotplug or changes to "cpuset.cpus" can
    cause a valid partition root to become invalid and vice versa.
    Note that a task cannot be moved to a cgroup with empty
    "cpuset.cpus.effective".

    For a valid partition root with the sibling cpu exclusivity
    rule enabled, changes made to "cpuset.cpus" that violate the
    exclusivity rule will invalidate the partition as well as its
    sibling partitions with conflicting "cpuset.cpus" values.  So
    care must be taken in changing "cpuset.cpus".

    A valid non-root parent partition may distribute out all its CPUs
    to its child partitions when there is no task associated with it.

    Care must be taken when changing a valid partition root to
    "member" as all its child partitions, if present, will become
    invalid causing disruption to tasks running in those child
    partitions. These inactivated partitions could be recovered if
    their parent is switched back to a partition root with a proper
    set of "cpuset.cpus".

    Poll and inotify events are triggered whenever the state of
    "cpuset.cpus.partition" changes.  That includes changes caused
    by writes to "cpuset.cpus.partition", cpu hotplug or other
    changes that modify the validity status of the partition.
    This will allow user space agents to monitor unexpected changes
    to "cpuset.cpus.partition" without the need to do continuous
    polling.
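
A user space agent might wait for such events along these lines. This is a sketch that assumes the change is reported to poll() as POLLPRI (with POLLERR), as is usual for cgroup files that generate notifications such as "cgroup.events". The helper names and the example path are hypothetical.

```python
import os
import select


def read_partition_state(fd):
    """Re-read the file from the start and return its stripped content."""
    os.lseek(fd, 0, os.SEEK_SET)
    return os.read(fd, 4096).decode().strip()


def wait_for_change(fd, timeout_ms):
    """Block until a notification arrives; False means the wait timed out."""
    poller = select.poll()
    # Assumption: state changes are signalled as POLLPRI; POLLERR and
    # POLLHUP are always reported by poll() without being requested.
    poller.register(fd, select.POLLPRI)
    return bool(poller.poll(timeout_ms))


# Possible monitoring loop (the path is only an example):
#
#   fd = os.open("/sys/fs/cgroup/app/cpuset.cpus.partition", os.O_RDONLY)
#   state = read_partition_state(fd)      # read once before waiting
#   while True:
#       if wait_for_change(fd, 60_000):
#           state = read_partition_state(fd)
#           print("partition state is now:", state)
```

The file is read once before the first wait so that only changes occurring after that read are reported.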

Cheers,
Longman



