Re: [PATCH v2 3/2] sched/deadline: Check bandwidth overflow earlier for hotplug

On 19/02/25 10:29, Dietmar Eggemann wrote:

...

> I did now. 

Thanks!

> Patch-wise I have:
> 
> (1) Putting 'fair_server's __dl_server_[de|at]tach_root() under if  
>     '(cpumask_test_cpu(rq->cpu, [old_rd->online|cpu_active_mask))' in 
>     rq_attach_root()
> 
>     https://lkml.kernel.org/r/Z7RhNmLpOb7SLImW@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> 
> (2) Create __dl_server_detach_root() and call it in rq_attach_root()
> 
>     https://lkml.kernel.org/r/Z4fd_6M2vhSMSR0i@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> 
> plus debug patch:
> 
>     https://lkml.kernel.org/r/Z6M5fQB9P1_bDF7A@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> 
> plus additional debug.

So you don't have the one with which we ignore special tasks while
rebuilding domains?

https://lore.kernel.org/all/Z6spnwykg6YSXBX_@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

Could you please double-check against this branch?

git@xxxxxxxxxx:jlelli/linux.git experimental/dl-debug

> The suspend issue still persists.
> 
> My hunch is that it's rather an issue with having 0 CPUs left in DEF
> while deactivating the last isol CPU (CPU3) so we set overflow = 1 w/o
> calling __dl_overflow(). We want to account fair_server_bw=52428
> against 0 CPUs. 
> 
> l B B l l l
> 
>       ^^^
>       isolcpus=[3,4]
> 
> 
> cpumask_and(mask, rd->span, cpu_active_mask)
> 
> mask = [3-5] & [0-3] = [3] -> dl_bw_cpus(3) = 1
> 
> ---
> 
> dl_bw_deactivate() called cpu=5
> 
> dl_bw_deactivate() called cpu=4
> 
> dl_bw_deactivate() called cpu=3
> 
> dl_bw_cpus() cpu=6 rd->span=3-5 cpu_active_mask=0-3 cpus=1 type=DEF
>                    ^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^
>   cpumask_subset(rd->span, cpu_active_mask) is false
> 
>   for_each_cpu_and(i, rd->span, cpu_active_mask)
>     cpus++                                       <-- cpus is 1 !!!
> 
> dl_bw_manage: cpu=3 cap=0 fair_server_bw=52428 total_bw=104856 dl_bw_cpus=1 type=DEF span=3-5
                                                          ^^^^^^
This still looks wrong: with a single CPU remaining we should only have
the corresponding dl server bandwidth present (unless there is some
other DL task running).
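
To make the numbers in the trace easier to check, here is a minimal
user-space sketch of the arithmetic. It is not kernel code: cpu_range(),
model_dl_bw_cpus() and model_servers_in_total() are hypothetical helpers
that use a plain 64-bit word in place of a cpumask, and the second helper
assumes that total_bw holds only fair server reservations.

```c
#include <stdint.h>

/* Build a mask with bits first..last set (inclusive), one bit per CPU. */
static uint64_t cpu_range(unsigned first, unsigned last)
{
	uint64_t mask = 0;

	for (unsigned cpu = first; cpu <= last && cpu < 64; cpu++)
		mask |= UINT64_C(1) << cpu;
	return mask;
}

/*
 * Model of the count in the trace: number of CPUs that are both in the
 * root domain span and in the active mask (span & active, then popcount).
 */
static unsigned model_dl_bw_cpus(uint64_t span, uint64_t active)
{
	uint64_t both = span & active;
	unsigned cpus = 0;

	while (both) {
		both &= both - 1;	/* clear the lowest set bit */
		cpus++;
	}
	return cpus;
}

/*
 * How many whole fair server reservations fit in total_bw.
 * Assumption: no other DL bandwidth is accounted in total_bw.
 */
static uint64_t model_servers_in_total(uint64_t total_bw,
				       uint64_t fair_server_bw)
{
	return fair_server_bw ? total_bw / fair_server_bw : 0;
}
```

With span=3-5 and cpu_active_mask=0-3, model_dl_bw_cpus(cpu_range(3, 5),
cpu_range(0, 3)) gives 1, matching dl_bw_cpus=1 in the trace, while
model_servers_in_total(104856, 52428) gives 2 with no remainder. Under
the stated assumption, total_bw carries two fair server reservations
for one counted CPU.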

If you already had the patch ignoring the sugov bandwidth in your set,
could you please share the full dmesg?

Thanks!