On 11/14/24 9:28 AM, Juri Lelli wrote:
Hello!
This is v2 of a patch series [3] that addresses two issues affecting
DEADLINE bandwidth accounting during non-destructive changes to root
domains and hotplug operations. The series is based on top of Waiman's
"cgroup/cpuset: Remove redundant rebuild_sched_domains_locked() calls"
series [1], which is now merged into cgroups/for-6.13 (commit
c4c9cebe2fb9). The discussion that eventually led to these two series can
be found at [2].
Waiman reported that v1 still failed to make his test_cpuset_prs.sh
happy, so I had to change both patches a little. It now seems to pass on
my runs.
Patch 01/02 deals with non-destructive root domain changes. With respect
to v1 we now always restore dl_server contributions, considering root
domain span and active cpus mask (otherwise accounting on the default
root domain would end up being incorrect).
Patch 02/02 deals with hotplug. With respect to v1 I added special
casing when total_bw = 0 (so no DEADLINE tasks to consider) and when a
root domain is left with no cpus due to hotplug.
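To make the two special cases for patch 02/02 concrete, here is a minimal
userspace sketch in C. It is not the code from the patch: struct rd_model,
its fields and can_offline_one_cpu() are hypothetical names, and rejecting
the request when bandwidth is still allocated but no cpu would remain is an
assumed policy, not something stated above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a root domain's DEADLINE accounting. */
struct rd_model {
	uint64_t total_bw;    /* bandwidth currently allocated to DEADLINE tasks */
	uint64_t bw_per_cpu;  /* bandwidth each active cpu can admit */
	unsigned int nr_cpus; /* active cpus, including the one going offline */
};

/* Return true if one cpu may be taken offline without overcommitting. */
static bool can_offline_one_cpu(const struct rd_model *rd)
{
	/* Special case 1: total_bw = 0, so no DEADLINE tasks to consider. */
	if (rd->total_bw == 0)
		return true;

	/* Special case 2 (assumed policy): bandwidth allocated, no cpu left. */
	if (rd->nr_cpus <= 1)
		return false;

	/* General case: the remaining cpus must still cover what is allocated. */
	return (uint64_t)(rd->nr_cpus - 1) * rd->bw_per_cpu >= rd->total_bw;
}
```

In this model an empty root domain is only a problem while bandwidth is
still allocated, which is why the total_bw check comes first.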
In all honesty, I still see intermittent issues that, however, seem to be
related to the dance we do in sched_cpu_deactivate(), where we first
turn everything related to a cpu/rq off and then revert that if
cpuset_cpu_inactive() reveals failing DEADLINE checks. But since these
seem to be orthogonal to the original discussion we started from, I
wanted to send this out as a hopefully meaningful update/improvement
since yesterday. I will continue looking into this.
Please go forth and test/review.
Series also available at
git@xxxxxxxxxx:jlelli/linux.git upstream/dl-server-apply
Best,
Juri
[1] https://lore.kernel.org/lkml/20241110025023.664487-1-longman@xxxxxxxxxx/
[2] https://lore.kernel.org/lkml/20241029225116.3998487-1-joel@xxxxxxxxxxxxxxxxx/
[3] v1 - https://lore.kernel.org/lkml/20241113125724.450249-1-juri.lelli@xxxxxxxxxx/
Juri Lelli (2):
  sched/deadline: Restore dl_server bandwidth on non-destructive root
    domain changes
  sched/deadline: Correctly account for allocated bandwidth during
    hotplug

 kernel/sched/core.c     |  2 +-
 kernel/sched/deadline.c | 65 +++++++++++++++++++++++++++++++++--------
 kernel/sched/sched.h    |  2 +-
 kernel/sched/topology.c |  8 +++--
 4 files changed, 60 insertions(+), 17 deletions(-)
Thanks for this new patch series. I have confirmed that, with some minor
tweaking of the cpuset code, all the test cases in the
test_cpuset_prs.sh script pass.
Tested-by: Waiman Long <longman@xxxxxxxxxx>