Re: Additional issue with cpuset isolated partitions?

On 11/15/24 11:30 AM, Juri Lelli wrote:
Hello,

While working on the recent cpuset/deadline fixes [1], I encountered
what looks like an issue to me. What I'm doing is (based on one of the
tests of test_cpuset_prs.sh):

# echo Y >/sys/kernel/debug/sched/verbose
# echo +cpuset >cgroup/cgroup.subtree_control
# mkdir cgroup/A1
# echo 0-3 >cgroup/A1/cpuset.cpus
# echo +cpuset >cgroup/A1/cgroup.subtree_control
# mkdir cgroup/A1/A2
# echo 1-3 >cgroup/A1/A2/cpuset.cpus
# echo +cpuset >cgroup/A1/A2/cgroup.subtree_control
# mkdir cgroup/A1/A2/A3
# echo 2-3 >cgroup/A1/A2/A3/cpuset.cpus
# echo 2-3 >cgroup/A1/cpuset.cpus.exclusive
# echo 2-3 >cgroup/A1/A2/cpuset.cpus.exclusive
# echo 2-3 >cgroup/A1/A2/A3/cpuset.cpus.exclusive
# echo isolated >cgroup/A1/A2/A3/cpuset.cpus.partition

and with this, on my 8 CPUs system, I correctly get a root domain for
0-1,4-7 and 2,3 are left isolated (attached to default root domain).

I now put the shell into the A1/A2/A3 cpuset

# echo $$ >cgroup/A1/A2/A3/cgroup.procs

and hotplug CPU 2,3

# echo 0 >/sys/devices/system/cpu/cpu2/online
# echo 0 >/sys/devices/system/cpu/cpu3/online

guess the shell is moved to the non-isolated domain. So far so good
then, only that if I turn CPUs 2,3 back on they are attached to the root
domain containing the non-isolated cpus

A valid partition must have CPUs associated with it. If no CPU is available, it becomes invalid and falls back to using the CPUs of the parent cgroup.

# echo 1 >/sys/devices/system/cpu/cpu2/online
...
[  990.133593] root domain span: 0-2,4-7
[  990.134480] rd 0-2,4-7

# echo 1 >/sys/devices/system/cpu/cpu3/online
...
[ 1082.858992] root domain span: 0-7
[ 1082.859530] rd 0-7

And now the A1/A2/A3 partition is not valid anymore

# cat cgroup/A1/A2/A3/cpuset.cpus.partition
isolated invalid (Invalid cpu list in cpuset.cpus.exclusive)

Is this expected? It looks like one needs to put at least one process in
the partition before hotplugging its cpus for the above to reproduce
(hotplugging w/o processes involved leaves CPUs 2,3 in the default domain
and isolated).

Once a partition becomes invalid, it does not recover by itself when its CPUs come back online. Users have to explicitly re-enable it. This is a very rare case, so we have not spent effort on automatic recovery.
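
As an illustration of the explicit re-enable step, here is a minimal shell sketch. It assumes that writing the partition type to cpuset.cpus.partition again is what re-enables the partition once its CPUs are back online. The function names are made up for this example, and the directory argument is the cgroup path (cgroup/A1/A2/A3 in the reproducer above).

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of any kernel tooling.

# Succeed if the partition in cgroup directory $1 is currently invalid.
# An invalid partition reads back as "<type> invalid (<reason>)".
partition_is_invalid() {
    grep -q ' invalid' "$1/cpuset.cpus.partition"
}

# Request partition type $2 (e.g. "isolated") again for cgroup directory $1,
# but only if the partition is currently invalid.
# Assumption: rewriting the type is how the partition gets re-enabled.
reenable_partition() {
    if partition_is_invalid "$1"; then
        echo "$2" > "$1/cpuset.cpus.partition"
    fi
}
```

With the setup above this would be called as `reenable_partition cgroup/A1/A2/A3 isolated` after CPUs 2 and 3 are online again, followed by another `cat` of cpuset.cpus.partition to check the result.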

If only one of the 2 CPUs is taken offline and then brought back online, the full 2-CPU isolated partition is recovered.
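
The difference between the two cases comes down to whether the partition still has at least one online CPU. A rough way to check that from a shell is sketched below. The helper names are hypothetical, and the sketch only compares two CPU lists in the kernel's list format (for example "0-2,4-7"); it does not query the cpuset code itself.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Expand a kernel CPU list such as "0-1,4-7" into one CPU number per line.
expand_cpulist() {
    echo "$1" | tr ',' '\n' | while IFS=- read -r first last; do
        [ -n "$first" ] || continue
        # A single CPU like "5" has no range end, so reuse the start.
        seq "$first" "${last:-$first}"
    done
}

# Print how many CPUs appear in both list $1 and list $2.
count_common_cpus() {
    { expand_cpulist "$1" | sort -u; expand_cpulist "$2" | sort -u; } |
        sort -n | uniq -d | wc -l | tr -d ' '
}
```

For example, `count_common_cpus "$(cat cgroup/A1/A2/A3/cpuset.cpus.exclusive)" "$(cat /sys/devices/system/cpu/online)"` gives the number of partition CPUs that are still online. A result of 0 corresponds to the case described above where the partition has no CPU left and becomes invalid.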

Please let me know if you have further questions.

Cheers,
Longman




