On 2/1/23 16:10, Peter Zijlstra wrote:
On Wed, Feb 01, 2023 at 01:46:11PM -0500, Waiman Long wrote:
Note that using cpus_allowed directly in cgroup v2 may not be right because
cpus_allowed may have no relationship to effective_cpus at all in some
cases, e.g.
root
|
V
A (cpus_allowed = 1-4, effective_cpus = 1-4)
|
V
B (cpus_allowed = 5-8, effective_cpus = 1-4)
In the case of cpuset B, passing back cpus 5-8 as the allowed_cpus is wrong.
I think my patch as written does the right thing here. Since the
intersection of (1-4) and (5-8) is empty it will move up the hierarchy
and we'll end up with (1-4) from the cgroup side of things.
So the purpose of __cs_cpus_allowed() is to override the cpus_allowed of
the root set and force it to cpu_possible_mask.
Then cs_cpus_allowed() computes the intersection of cs->cpus_allowed and
all it's parents. This will, in the case of B above, result in the empty
mask.
Then cpuset_cpus_allowed() has a loop that starts with
task_cpu_possible_mask(), intersects that with cs_cpus_allowed() and if
the intersection of that and cpu_online_mask is empty, moves up the
hierarchy. Given cs_cpus_allowed(B) is the empty mask, we'll move to A.
Note that since we force the mask of root to cpu_possible_mask,
cs_cpus_allowed(root) will be a no-op and if we guarantee (in arch code)
that cpu_online_mask always has a non-empty intersection with
task_cpu_possible_mask(), this loop is guaranteed to terminate with a
viable mask.
I will take a closer look at that tomorrow. I will be more comfortable
ack'ing that if this is specific to v1 cpuset instead of applying this
in both v1 and v2 since it is only v1 that is problematic.
Cheers,
Longman