On 19/02/2025 11:02, Juri Lelli wrote:
> On 19/02/25 10:29, Dietmar Eggemann wrote:

[...]

> So you don't have the one with which we ignore special tasks while
> rebuilding domains?
>
> https://lore.kernel.org/all/Z6spnwykg6YSXBX_@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>
> Could you please double check again against
>
> git@xxxxxxxxxx:jlelli/linux.git experimental/dl-debug

Sorry, I forgot this one. Yes, I have it as well.

void dl_add_task_root_domain(struct task_struct *p)
{
        struct rq_flags rf;
        struct rq *rq;
        struct dl_bw *dl_b;

        raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
        if (!dl_task(p) || dl_entity_is_special(&p->dl)) {
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                raw_spin_unlock_irqrestore(&p->pi_lock, rf.flags);
                return;
        }

>> The suspend issue still persists.
>>
>> My hunch is that it's rather an issue with having 0 CPUs left in DEF
>> while deactivating the last isol CPU (CPU3) so we set overflow = 1 w/o
>> calling __dl_overflow(). We want to account fair_server_bw=52428
>> against 0 CPUs.
>>
>> l B B l l l
>>
>>       ^^^
>>       isolcpus=[3,4]
>>
>>
>> cpumask_and(mask, rd->span, cpu_active_mask)
>>
>> mask = [3-5] & [0-3] = [3] -> dl_bw_cpus(3) = 1
>>
>> ---
>>
>> dl_bw_deactivate() called cpu=5
>>
>> dl_bw_deactivate() called cpu=4
>>
>> dl_bw_deactivate() called cpu=3
>>
>> dl_bw_cpus() cpu=6 rd->span=3-5 cpu_active_mask=0-3 cpus=1 type=DEF
>>                    ^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^
>>   cpumask_subset(rd->span, cpu_active_mask) is false
>>
>>   for_each_cpu_and(i, rd->span, cpu_active_mask)
>>     cpus++                                      <-- cpus is 1 !!!
>>
>> dl_bw_manage: cpu=3 cap=0 fair_server_bw=52428 total_bw=104856 dl_bw_cpus=1 type=DEF span=3-5
>                                                          ^^^^^^
> This still looks wrong: with a single cpu remaining we should only have
> the corresponding dl server bandwidth present (unless there is some
> other DL task running.

That's true. '104856 - 52428 = 52428' so util of 51 ? Which is 50% of a
sugov task? Or exactly the fair_server_bw.
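The arithmetic above can be checked with a small user-space sketch. It is not kernel code: to_ratio_sketch() and bw_to_util() are hypothetical helpers, and the sketch assumes a 20-bit fixed-point bandwidth scale, a 1024-based utilization scale, and a fair server configured with 50 ms runtime per 1 s period.

```c
#include <stdint.h>

#define BW_SHIFT        20      /* assumption: bandwidth is a 20-bit fixed-point ratio */
#define CAPACITY_SHIFT  10      /* assumption: 1024 represents one full CPU */

/* Hypothetical helper: runtime/period expressed on the BW_SHIFT scale. */
static uint64_t to_ratio_sketch(uint64_t period_ns, uint64_t runtime_ns)
{
        return (runtime_ns << BW_SHIFT) / period_ns;
}

/* Hypothetical helper: convert a bandwidth value to the 0..1024 util scale. */
static uint64_t bw_to_util(uint64_t bw)
{
        return bw >> (BW_SHIFT - CAPACITY_SHIFT);
}
```

Under these assumptions, to_ratio_sketch(1000000000ULL, 50000000ULL) yields 52428, the fair_server_bw from the trace, and bw_to_util(52428) yields 51. The leftover 104856 - 52428 is therefore exactly one more fair-server contribution.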
But the bw numbers don't matter here, since we go straight into the else
path because dl_bw_cpus(3) = 1.

        if (dl_bw_cpus(cpu) - 1)
                overflow = __dl_overflow(dl_b, cap, fair_server_bw, 0);
        else
                overflow = 1;

> If you already had the patch ignoring sugovs bandwidth in your set, could
> you please share the full dmesg?

Will do later today ... busy with other stuff right now ;-(

BTW, I just saw that this issue also happens for me w/o sugov threads
(running with Performance CPUfreq governor)! So the remaining
'total_bw=104856' must be the contribution from 2 CPUs of DEF.

Maybe we just have a CPU-offset in this accounting somewhere during suspend?
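The path taken above can be modelled in user space. This is a simplified sketch under stated assumptions, not the kernel implementation: CPUs are bits in a 64-bit word instead of a struct cpumask, and cpu_range(), count_dl_cpus(), overflow_check() and deactivate_overflows() are hypothetical names. overflow_check() only approximates the admission test (capacity-scaled limit compared against total_bw - old_bw + new_bw).

```c
#include <stdbool.h>
#include <stdint.h>

#define CAPACITY_SHIFT 10       /* assumption: 1024 represents one full CPU */

typedef uint64_t mask_t;        /* stand-in for struct cpumask, one bit per CPU */

/* Hypothetical helper: mask with bits first..last set (inclusive, last < 64). */
static mask_t cpu_range(unsigned int first, unsigned int last)
{
        mask_t m = 0;

        for (unsigned int i = first; i <= last; i++)
                m |= (mask_t)1 << i;
        return m;
}

/* Model of the CPU count: CPUs of the root domain span that are still active. */
static int count_dl_cpus(mask_t span, mask_t active)
{
        int cpus = 0;

        for (mask_t m = span & active; m; m &= m - 1)
                cpus++;         /* one iteration per set bit */
        return cpus;
}

/* Simplified admission test: true if the remaining bandwidth exceeds the limit. */
static bool overflow_check(uint64_t bw, uint64_t cap, uint64_t total_bw,
                           uint64_t old_bw, uint64_t new_bw)
{
        return ((bw * cap) >> CAPACITY_SHIFT) < total_bw - old_bw + new_bw;
}

/* Model of the quoted branch: with one CPU counted, bandwidth is never checked. */
static bool deactivate_overflows(int cpus, uint64_t bw, uint64_t cap,
                                 uint64_t total_bw, uint64_t fair_server_bw)
{
        if (cpus - 1)
                return overflow_check(bw, cap, total_bw, fair_server_bw, 0);
        return true;
}
```

With the masks from the trace, count_dl_cpus(cpu_range(3, 5), cpu_range(0, 3)) is 1, so deactivate_overflows() returns true whatever total_bw and fair_server_bw are. Only with two or more CPUs counted do the bandwidth values enter the decision.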