On 11/13/24 1:11 PM, Juri Lelli wrote:
On 13/11/24 11:50, Waiman Long wrote:
On 11/13/24 11:42 AM, Waiman Long wrote:
On 11/13/24 11:40 AM, Juri Lelli wrote:
On 13/11/24 11:06, Waiman Long wrote:
...
This part can still cause a failure in one of the test cases in my cpuset
partition test script. In this particular case, the CPU to be offlined is an
isolated CPU with scheduling disabled. As a result, total_bw is 0 and the
__dl_overflow() test failed. Is there a way to skip the __dl_overflow() test
for isolated CPUs? Can we use a null total_bw as a proxy for that?
Can you please share the repro script? Would like to check locally what
is going on.
Just run tools/testing/selftests/cgroup/test_cpuset_prs.sh.
The failing test is
# Remote partition offline tests
" C0-3:S+ C1-3:S+ C2-3 . X2-3 X2-3 X2-3:P2:O2=0 . 0
A1:0-1,A2:1,A3:3 A1:P0,A3:P2 2-3"
You can remove all the previous lines in the TEST_MATRIX to get to the failed
test case immediately, eliminating unnecessary noise in your testing.
So, IIUC, this test is doing the following:
# echo +cpuset >cgroup/cgroup.subtree_control
# mkdir cgroup/A1
# echo 0-3 >cgroup/A1/cpuset.cpus
# echo +cpuset >cgroup/A1/cgroup.subtree_control
# mkdir cgroup/A1/A2
# echo 1-3 >cgroup/A1/A2/cpuset.cpus
# echo +cpuset >cgroup/A1/A2/cgroup.subtree_control
# mkdir cgroup/A1/A2/A3
# echo 2-3 >cgroup/A1/A2/A3/cpuset.cpus
# echo 2-3 >cgroup/A1/cpuset.cpus.exclusive
# echo 2-3 >cgroup/A1/A2/cpuset.cpus.exclusive
# echo 2-3 >cgroup/A1/A2/A3/cpuset.cpus.exclusive
# echo isolated >cgroup/A1/A2/A3/cpuset.cpus.partition
With the last command, we get to one root domain with span: 0-1,4-7 (in
my setup with 8 CPUs) and no root domain for 2,3, since they are
isolated.
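To confirm that the partition was actually created as described, the cpuset control files of the cgroup can be read back. This is only a minimal sketch: `show_partition` and the `CGROUP_ROOT` variable are hypothetical names introduced here, and the default of `cgroup` simply mirrors the relative path used in the commands above.

```shell
# Sketch: print the cpuset partition state of one cgroup.
# CGROUP_ROOT is an assumption; it mirrors the relative "cgroup/" path above.
CGROUP_ROOT="${CGROUP_ROOT:-cgroup}"

# show_partition CGROUP    e.g. show_partition A1/A2/A3
show_partition() {
    dir="$CGROUP_ROOT/$1"
    for f in cpuset.cpus cpuset.cpus.exclusive \
             cpuset.cpus.effective cpuset.cpus.partition; do
        if [ -r "$dir/$f" ]; then
            # Print "file: contents" for each control file that exists.
            printf '%s: %s\n' "$f" "$(cat "$dir/$f")"
        else
            # Older kernels may lack some files (e.g. cpuset.cpus.exclusive).
            printf '%s: <not available>\n' "$f"
        fi
    done
}
```

Calling `show_partition A1/A2/A3` after the last command should report `isolated` for `cpuset.cpus.partition` if the partition is valid.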
The test then tries to hotplug CPU 2, but fails to do so, hence the
reported error.
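The hotplug step can also be tried by hand through the per-CPU `online` file in sysfs. What follows is a rough sketch rather than what the test script itself does: `set_cpu_online` is a hypothetical helper, and `SYSFS_CPU_ROOT` is an assumed override that exists only so the helper can be pointed at a scratch directory.

```shell
# Sketch: toggle a CPU's hotplug state through sysfs.
# SYSFS_CPU_ROOT is a hypothetical override; the real location is the default.
SYSFS_CPU_ROOT="${SYSFS_CPU_ROOT:-/sys/devices/system/cpu}"

# set_cpu_online CPU STATE    (STATE: 0 = offline, 1 = online)
set_cpu_online() {
    cpu="$1"
    state="$2"
    ctl="$SYSFS_CPU_ROOT/cpu$cpu/online"

    if [ ! -w "$ctl" ]; then
        echo "cpu$cpu: no writable hotplug control at $ctl" >&2
        return 1
    fi

    # The write itself fails when the kernel refuses the transition.
    if ! echo "$state" > "$ctl" 2>/dev/null; then
        echo "cpu$cpu: kernel rejected online=$state" >&2
        return 1
    fi

    echo "cpu$cpu: online=$(cat "$ctl")"
}
```

With the partition set up as above, `set_cpu_online 2 0` run as root would be expected to hit the rejected-write branch on an affected kernel, and `set_cpu_online 2 1` brings the CPU back once it has gone offline.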
total_bw for CPU 2 and CPU 3 is indeed 0, and I guess we could special
case this as you suggest (nothing to really worry about if we don't have
DEADLINE tasks affined to these CPUs). But I would have expected the
fair server contribution to still show up in total_bw, so this is
something I need to check.
Thanks for looking into this. So the test script does create a lot of
different corner cases to test the correctness of the cpuset partition
code. Hopefully that will help you to improve the DL code to better
handle these corner cases.
Cheers,
Longman