Hi,

On Fri, 19 May 2023 12:12:50 -0400
Vineeth Remanan Pillai <vineeth@xxxxxxxxxxxxxxx> wrote:
[...]
> With both these equations, it doesn't solve couple of other issues we
> had discussed before:
> - tasks with different bandwidth reclaims differently even when #tasks
> is less than #cpus.

I think I now understand this issue (see below)

> - cpu util may go to 100% when we have tasks with large bandwidth
> close to Umax

This one is still not clear to me... I'll do some more analysis.

> As an eg. for issue 1, three tasks - (7,10) (3,10) and (1,10):
> TID[590]: RECLAIM=1, (r=7ms, d=10ms, p=10ms), Util: 95.20
> TID[591]: RECLAIM=1, (r=3ms, d=10ms, p=10ms), Util: 81.94
> TID[592]: RECLAIM=1, (r=1ms, d=10ms, p=10ms), Util: 27.19

So, the issue here is that GRUB tries to assign the reclaimed
utilization proportionally to the tasks' utilizations... So, 591 should
execute 3 times the amount of time executed by 592, and 590 should
execute 7 times the amount of time executed by 592.

Task 592 is then supposed to execute for
95 / (1 + 3 + 7) = 95 / 11 = 8.636% of the CPU time; task 591 is
supposed to execute for 8.636 * 3 = 25.9% of the CPU time; task 590 is
supposed to execute for 8.636 * 7 = 60.45% of the CPU time.

So, 592 executes for 8.636 * 3 = 25.9% of the time on one single CPU
(you measured 27.19, which is near), task 591 executes for
25.9 * 3 = 77.7% of the time on one single CPU (again, this is close to
what you measured) and task 590 should execute for...
60.45 * 3 = 181.3% of the time on one single CPU! Which is clearly not
possible... And the "max{}" rule cuts this to 95%.

So, we are wasting 181.3 - 95 = 86.3% of CPU time, which 590 cannot
reclaim (because it cannot execute simultaneously on 2 CPUs).
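
To double-check the arithmetic above, here is a small Python sketch. It
is not the kernel's GRUB code, only the proportional split described in
this mail. The function name, the 95% reclaimable utilization per CPU,
and the 3-CPU count (inferred from the factor 3 used above) are
assumptions made for illustration.

```python
def proportional_reclaim(runtimes_ms, util_max=95.0, n_cpus=3):
    """Split reclaimed utilization proportionally to task runtimes.

    Hypothetical helper, not a kernel function. Returns, per task, the
    ideal share and the share after capping, both expressed as a
    percentage of one single CPU, plus the total utilization lost to
    the cap.
    """
    total = sum(runtimes_ms)
    # Share of each task, scaled to a percentage of one single CPU.
    ideal = [util_max * r / total * n_cpus for r in runtimes_ms]
    # A task cannot run on 2 CPUs at once, so its share is cut to util_max.
    capped = [min(u, util_max) for u in ideal]
    wasted = sum(i - c for i, c in zip(ideal, capped))
    return ideal, capped, wasted


# Tasks 590, 591 and 592 from the test quoted above (runtime in ms, period 10ms).
runtimes = [7, 3, 1]
ideal, capped, wasted = proportional_reclaim(runtimes)

for r, i, c in zip(runtimes, ideal, capped):
    print(f"r={r}ms: ideal={i:.2f}% capped={c:.2f}% of one CPU")
print(f"wasted={wasted:.2f}% of one CPU")
```

Only the (7,10) task hits the cap here, so all of the wasted
utilization comes from that single task.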
This wasted CPU time (about 86%) is close to the amount of CPU time not
reclaimed in the test you cite above (95 - 81 + 95 - 27 = 82).

Now that the problem is clearer to me, I am trying to understand a
possible solution (as you mention, moving some extra bandwidth from
590's CPU will fix this problem... But I am not sure if this dynamic
extra bandwidth migration is feasible in practice without introducing
too much overhead).

I'll take a closer look at your new proposal.


			Thanks,
				Luca