Hi,

> I wanted to say one v5.12-rcX version to make sure this is still a
> valid problem on latest version

Ahh, I see. No problem. :) Thank you so much for taking the time to
look at this!

> I confirm that I can see a ratio of 4ms vs 204ms running time with the
> patch below.

(I assume you are talking about the bash code for reproducing, not the
actual sched patch.)

> But when I look more deeply in my trace (I have
> instrumented the code), it seems that the 2 stress-ng don't belong to
> the same cgroup but remained in cg-1 and cg-2 which explains such
> running time difference.

(My second reply to your previous mail might also help clarify this.)

I am not sure if I stated it clearly, or if we are talking about the
same thing. It _is_ the intention that the two procs should not be in
the same cgroup. In the same way as people create "containers", each
proc runs in a separate cgroup in the example. The issue is not the
balancing between the procs themselves, but rather between the
cgroups/sched_entities inside the cgroup hierarchy (due to the fact
that the vruntime of those sched_entities ends up being calculated
with more load than they are supposed to have). If you have any
thoughts about the phrasing of the patch itself to make it easier to
understand, feel free to suggest changes.

Given the last cgroup v1 script, I get this:

- cat /proc/<stress-pid-1>/cgroup | grep cpu
11:cpu,cpuacct:/slice/cg-1/sub
3:cpuset:/slice

- cat /proc/<stress-pid-2>/cgroup | grep cpu
11:cpu,cpuacct:/slice/cg-2/sub
3:cpuset:/slice

The cgroup hierarchy will then roughly look like this (using cgroup v2
terms, because I find them easier to reason about):

slice/
  cg-1/
    cpu.weight: 100
    sub/
      cpu.weight: 1
      cpuset.cpus: 1
      cgroup.procs - stress process 1 here
  cg-2/
    cpu.weight: 100
    sub/
      cpu.weight: 10000
      cpuset.cpus: 1
      cgroup.procs - stress process 2 here

This should result in a 50/50 split, since cg-1 and cg-2 both have a
weight of 100 and both "live" inside the /slice cgroup. The inner
weight should not matter, since there is only one cgroup at that level.

> So your script doesn't reproduce the bug you
> want to highlight. That being said, I can also see a diff between the
> contrib of the cpu0 in the tg_load. I'm going to look further

There can definitely be some other issues involved, and I am pretty
sure you have way more knowledge about the scheduler than me... :)
However, I am pretty sure that it does in fact show the issue I am
talking about, and applying the patch does indeed make it impossible
to reproduce on my systems.

Odin
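
P.S. For completeness, here is a rough sketch of how the same
hierarchy could be set up directly with cgroup v2. This is not the v1
script from the thread; the /sys/fs/cgroup mount point, the controller
delegation steps and the stress-ng invocation below are my assumptions
(run as root, and CPU 1 must exist):

# Sketch only: cgroup v2 setup mirroring the hierarchy above.
CG=/sys/fs/cgroup/slice
mkdir -p "$CG/cg-1/sub" "$CG/cg-2/sub"

# Make the cpu and cpuset controllers available down the tree.
echo "+cpu +cpuset" > /sys/fs/cgroup/cgroup.subtree_control
echo "+cpu +cpuset" > "$CG/cgroup.subtree_control"
echo "+cpu +cpuset" > "$CG/cg-1/cgroup.subtree_control"
echo "+cpu +cpuset" > "$CG/cg-2/cgroup.subtree_control"

# Equal outer weights -> the two "containers" should split the CPU 50/50 ...
echo 100   > "$CG/cg-1/cpu.weight"
echo 100   > "$CG/cg-2/cpu.weight"
# ... regardless of the inner weights, since each sub/ is the only cgroup
# at its level.
echo 1     > "$CG/cg-1/sub/cpu.weight"
echo 10000 > "$CG/cg-2/sub/cpu.weight"

# Pin both leaves to the same CPU so the two processes actually compete.
echo 1 > "$CG/cg-1/sub/cpuset.cpus"
echo 1 > "$CG/cg-2/sub/cpuset.cpus"

# One CPU hog per leaf; move the subshell into the cgroup before exec so
# the stress-ng workers inherit it.
( echo $BASHPID > "$CG/cg-1/sub/cgroup.procs"; exec stress-ng --cpu 1 --timeout 30s ) &
( echo $BASHPID > "$CG/cg-2/sub/cgroup.procs"; exec stress-ng --cpu 1 --timeout 30s ) &
wait

Afterwards one would expect usage_usec in cg-1/sub/cpu.stat and
cg-2/sub/cpu.stat to be roughly equal on a fixed kernel, and heavily
skewed on an affected one.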