Hello Abel (sorry for my delay).

On Wed, Jan 29, 2025 at 12:48:09PM +0800, Abel Wu <wuyun.abel@xxxxxxxxxxxxx> wrote:
> PSI tracks stall times for each cpu, and
>
> 	tSOME[cpu] = time(nr_delayed_tasks[cpu] != 0)
>
> which turns nr_delayed_tasks[cpu] into boolean value, hence loses
> insight into how severely this task group is stalled on this cpu.

Thanks for the example. So the lost information is kind of a group load.
What meaning does it have when there is no group throttling?

Honestly, I can't reason about either PSI.some or Σ run_delay wrt
feedback for resource control. What is slightly bugging me is the
introduction of another stats field before the first one was explored :-)

But if there's information loss with PSI -- could cpu.pressure:some be
removed in favor of Σ run_delay? (The former could be calculated from
the latter if you're right :-p)

(I didn't like the before/after shuffling with enum cpu_usage_stat
NR_STATS but I saw v4 where you tackled that.)

Michal

More context from the previous message: the difference is between a) and
c), or better, with equal lanes:

a')
	t1 |----|
	t2 |xx--|
	t3 |----|

c)
	t1 |----|
	t2 |xx--|
	t3 |xx--|

	   <-Δt->

run_delay can be calculated independently of cpu.pressure:some because
there is still a difference between a') and c) in terms of total cpu
usage.

	Δrun_delay = nr * Δt - Δusage

The challenge is with nr (assuming they're all runnable during Δt); that
would need to be sampled from /sys/kernel/debug/sched/debug. But then
you can get whatever load for individual cfs_rqs from there.

Hm, does it even make sense to add up run_delays from different CPUs?
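
To make the Δrun_delay relation concrete, here is a rough sketch that applies it to the two lane diagrams a') and c). It assumes that 'x' means one tick spent waiting for a CPU, that '-' means one tick spent running, and that all nr tasks are runnable for the whole Δt; the summarize() helper and the dictionary layout are made up for illustration and are not kernel interfaces. The "some" value is only a boolean view over the whole group, which simplifies how PSI really aggregates per-CPU times.

```python
# Lanes copied from the diagrams above. ASSUMPTION: 'x' is one tick spent
# waiting for a CPU, '-' is one tick spent running, and every task is
# runnable for the whole window.
LANES_A = {"t1": "----", "t2": "xx--", "t3": "----"}  # a')
LANES_C = {"t1": "----", "t2": "xx--", "t3": "xx--"}  # c)


def summarize(lanes):
    """Hypothetical helper: derive the quantities discussed above."""
    rows = list(lanes.values())
    nr = len(rows)          # number of runnable tasks
    dt = len(rows[0])       # window length in ticks
    # Total CPU time consumed by the group during the window.
    usage = sum(row.count("-") for row in rows)
    # Summed delay, counted directly from the lanes.
    run_delay_direct = sum(row.count("x") for row in rows)
    # Summed delay via: delta_run_delay = nr * delta_t - delta_usage
    run_delay_formula = nr * dt - usage
    # Boolean "some"-style view: ticks where at least one task waits.
    some = sum(1 for tick in zip(*rows) if "x" in tick)
    return {
        "nr": nr,
        "dt": dt,
        "usage": usage,
        "run_delay_direct": run_delay_direct,
        "run_delay_formula": run_delay_formula,
        "some": some,
    }


if __name__ == "__main__":
    for name, lanes in (("a'", LANES_A), ("c", LANES_C)):
        print(name, summarize(lanes))
```

Under those assumptions both cases give some = 2 ticks, while the summed delay is 2 for a') and 4 for c), and the formula agrees with the direct count (3 * 4 - 10 and 3 * 4 - 8). The catch remains nr: the sketch reads it off the diagram, whereas in practice it would have to be sampled.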