On 19/02/25 10:29, Dietmar Eggemann wrote:

...

> I did now.

Thanks!

> Patch-wise I have:
>
> (1) Putting 'fair_server's __dl_server_[de|at]tach_root() under if
>     '(cpumask_test_cpu(rq->cpu, [old_rd->online|cpu_active_mask))' in
>     rq_attach_root()
>
>     https://lkml.kernel.org/r/Z7RhNmLpOb7SLImW@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>
> (2) Create __dl_server_detach_root() and call it in rq_attach_root()
>
>     https://lkml.kernel.org/r/Z4fd_6M2vhSMSR0i@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>
> plus debug patch:
>
>     https://lkml.kernel.org/r/Z6M5fQB9P1_bDF7A@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>
> plus additional debug.

So you don't have the one with which we ignore special tasks while
rebuilding domains?

https://lore.kernel.org/all/Z6spnwykg6YSXBX_@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

Could you please double check again against

git@xxxxxxxxxx:jlelli/linux.git experimental/dl-debug

> The suspend issue still persists.
>
> My hunch is that it's rather an issue with having 0 CPUs left in DEF
> while deactivating the last isol CPU (CPU3) so we set overflow = 1 w/o
> calling __dl_overflow(). We want to account fair_server_bw=52428
> against 0 CPUs.
>
> l B B l l l
>
>       ^^^
>       isolcpus=[3,4]
>
>
> cpumask_and(mask, rd->span, cpu_active_mask)
>
> mask = [3-5] & [0-3] = [3] -> dl_bw_cpus(3) = 1
>
> ---
>
> dl_bw_deactivate() called cpu=5
>
> dl_bw_deactivate() called cpu=4
>
> dl_bw_deactivate() called cpu=3
>
> dl_bw_cpus() cpu=6 rd->span=3-5 cpu_active_mask=0-3 cpus=1 type=DEF
>                    ^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^
>   cpumask_subset(rd->span, cpu_active_mask) is false
>
>   for_each_cpu_and(i, rd->span, cpu_active_mask)
>     cpus++   <-- cpus is 1 !!!
>
> dl_bw_manage: cpu=3 cap=0 fair_server_bw=52428 total_bw=104856 dl_bw_cpus=1 type=DEF span=3-5
                                                          ^^^^^^
This still looks wrong: with a single CPU remaining we should only have
the corresponding dl server bandwidth present (unless there is some
other DL task running).

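To make the arithmetic behind that remark explicit, here is a small
userspace sketch (not kernel code) that redoes the numbers from the log
above. It assumes the fair server runs with 50ms runtime every 1s, which
is what yields 52428 with a 20-bit fixed-point shift. cpu_range(),
count_active_in_span() and excess_bw() are hypothetical helpers made up
for illustration. Plain 32-bit masks stand in for struct cpumask.

```c
#include <stdint.h>

#define BW_SHIFT 20	/* fixed-point shift used for bandwidth ratios */

/* Userspace stand-in for to_ratio(): runtime/period scaled by 2^BW_SHIFT. */
static uint64_t to_ratio(uint64_t period_ns, uint64_t runtime_ns)
{
	return (runtime_ns << BW_SHIFT) / period_ns;
}

/* Hypothetical helper: mask with bits first..last set (CPU numbers < 32). */
static uint32_t cpu_range(unsigned int first, unsigned int last)
{
	uint32_t mask = 0;

	for (unsigned int i = first; i <= last; i++)
		mask |= UINT32_C(1) << i;

	return mask;
}

/* Mimics the for_each_cpu_and(i, span, active) { cpus++; } loop. */
static unsigned int count_active_in_span(uint32_t span, uint32_t active)
{
	unsigned int cpus = 0;

	/* Clear the lowest set bit on each iteration. */
	for (uint32_t m = span & active; m; m &= m - 1)
		cpus++;

	return cpus;
}

/*
 * Hypothetical check: bandwidth left in total_bw once one server's
 * worth per counted CPU is subtracted. Returns 0 if nothing is left.
 */
static uint64_t excess_bw(uint64_t total_bw, unsigned int cpus,
			  uint64_t server_bw)
{
	uint64_t expected = (uint64_t)cpus * server_bw;

	return total_bw > expected ? total_bw - expected : 0;
}
```

With span=3-5 and cpu_active_mask=0-3 the count is 1, and
excess_bw(104856, 1, 52428) comes out at 52428: total_bw still holds
exactly one extra server's worth of bandwidth, assuming no other DL
entity contributes to it.
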
If you already had the patch ignoring sugovs bandwidth in your set,
could you please share the full dmesg?

Thanks!