On 5/2/23 14:01, Michal Koutný wrote:
Hello.
The previous thread reached me incomplete, so I am responding to the last
message only. Point me to a message URL if this was already covered.
On Fri, Apr 14, 2023 at 03:06:27PM -0400, Waiman Long <longman@xxxxxxxxxx> wrote:
Below is a draft of the new cpuset.cpus.reserve cgroupfs file:
cpuset.cpus.reserve
A read-write multiple values file which exists on all
cpuset-enabled cgroups.
It lists the reserved CPUs to be used for the creation of
child partitions. See the section on "cpuset.cpus.partition"
below for more information on cpuset partition. These reserved
CPUs should be a subset of "cpuset.cpus" and will be mutually
exclusive of "cpuset.cpus.effective" when used since these
reserved CPUs cannot be used by tasks in the current cgroup.
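The subset and mutual-exclusion rules in the paragraph above can be sketched as plain set arithmetic. This is only an illustration, not kernel code: parse_cpulist and check_reserve are hypothetical helper names, the example CPU ranges are made up, and the sketch assumes every reserved CPU is removed from the effective set (the draft only says this holds "when used").

```python
def parse_cpulist(text):
    """Parse a kernel-style CPU list such as "0-3,8" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def check_reserve(cpus, reserve):
    """Return the CPUs left for the cgroup's own tasks (hypothetical helper).

    Raises ValueError if the reserve is not a subset of cpuset.cpus.
    Assumption: all reserved CPUs are excluded from the effective set.
    """
    if not reserve <= cpus:
        raise ValueError("cpuset.cpus.reserve must be a subset of cpuset.cpus")
    return cpus - reserve


# Made-up example values for cpuset.cpus and cpuset.cpus.reserve.
cpus = parse_cpulist("0-7")
reserve = parse_cpulist("4-7")
effective = check_reserve(cpus, reserve)
```

With these values the effective set and the reserve never share a CPU, which is the exclusivity the draft text asks for.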
There are two modes of partition CPU reservation:
auto and manual. The system starts up in auto mode, where
"cpuset.cpus.reserve" will be set automatically when valid
child partitions are created and users don't need to touch the
file at all. This mode has the limitation that the parent of a
partition must be a partition root itself. So child partitions
have to be created one by one from the cgroup root down.
To enable the creation of a partition down in the hierarchy
without requiring the intermediate cgroups to be partition roots,
Why would this be needed? Owning a CPU (a resource) must logically be
passed all the way from root to the target cgroup, i.e. this is
expressed by valid partitioning down to the given level.
one
has to turn on the manual reservation mode by writing directly
to "cpuset.cpus.reserve" with a value different from its
current value. By distributing the reserved CPUs down the cgroup
hierarchy to the parent of the target cgroup, this target cgroup
can be switched to become a partition root if its "cpuset.cpus"
is a subset of the set of valid reserved CPUs in its parent.
level n
`- level n+1
     cpuset.cpus            // these are actually configured by "owner" of level n
     cpuset.cpus.partition  // similarly here, level n decides if child is a partition
I.e. what would level n/cpuset.cpus.reserve be good for when it can
directly control level n+1/cpuset.cpus?
In the new scheme, the available cpus are still directly passed down to
a descendant cgroup. However, isolated CPUs (or more generally CPUs
dedicated to a partition) have to be exclusive. So what
cpuset.cpus.reserve does is identify those exclusive CPUs that can be
excluded from the effective_cpus of the parent cgroups before they are
claimed by a child partition. Currently this is done automatically when
a child partition is created off a parent partition root. The new scheme
will break it into two separate steps without the requirement that the
parent of a partition has to be a partition root itself.
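To make the two steps concrete, here is a toy model of the scheme as I read it from the description above. It is a sketch under stated assumptions, not the kernel implementation: the Cgroup class, set_reserve(), make_partition() and effective() are hypothetical names, and the rules they enforce (a reserve must come from the parent's reserve, and a partition's cpuset.cpus must come from its parent's reserve) are my simplification of the draft.

```python
class Cgroup:
    """Toy cpuset cgroup; hypothetical model, not a kernel data structure."""

    def __init__(self, name, cpus, parent=None):
        self.name = name
        self.cpus = set(cpus)          # models cpuset.cpus
        self.parent = parent
        self.reserve = set()           # models cpuset.cpus.reserve
        self.partition_root = parent is None  # top cgroup is always a root

    def set_reserve(self, cpus):
        """Step 1: set aside exclusive CPUs, taken from the parent's reserve."""
        cpus = set(cpus)
        if not cpus <= self.cpus:
            raise ValueError("reserve must be a subset of cpuset.cpus")
        if self.parent is not None and not cpus <= self.parent.reserve:
            raise ValueError("reserve must come from the parent's reserve")
        self.reserve = cpus

    def make_partition(self):
        """Step 2: claim CPUs from the parent's reserve as a partition root."""
        if self.parent is None or not self.cpus <= self.parent.reserve:
            raise ValueError("cpuset.cpus must be within the parent's reserve")
        self.partition_root = True

    def effective(self):
        """CPUs usable by tasks in this cgroup (models effective_cpus)."""
        if self.parent is None or self.partition_root:
            base = set(self.cpus)
        else:
            base = self.cpus & self.parent.effective()
        return base - self.reserve


# Made-up hierarchy: root -> mid (not a partition root) -> leaf.
root = Cgroup("root", range(8))
mid = Cgroup("mid", range(8), parent=root)
leaf = Cgroup("leaf", {6, 7}, parent=mid)

# Step 1: pass the reserved CPUs down to the parent of the target cgroup.
root.set_reserve({6, 7})
mid.set_reserve({6, 7})

# Step 2: the target claims them, although mid is not a partition root.
leaf.make_partition()
```

In this model CPUs 6 and 7 drop out of the effective sets of root and mid at step 1, before leaf claims them at step 2, and mid never has to become a partition root.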
Cheers,
Longman