Hi Vineeth,

On 20/05/2023 04:15, Vineeth Remanan Pillai wrote:
> Hi Dietmar,
>
> On Fri, May 19, 2023 at 1:56 PM Dietmar Eggemann
> <dietmar.eggemann@xxxxxxx> wrote:
>
>>> TID[730]: RECLAIM=1, (r=8ms, d=10ms, p=10ms), Util: 95.05
>>> TID[731]: RECLAIM=1, (r=1ms, d=10ms, p=10ms), Util: 31.34
>>> TID[732]: RECLAIM=1, (r=1ms, d=100ms, p=100ms), Util: 3.16
>>
>> What does this 'Util: X' value stand for? I assume it's the utilization
>> of the task? How do you obtain it?
>>
> Yes, it is the utilization of the task. I calculate it by dividing the
> cputime with elapsed time(using clock_gettime(2)).

Makes sense. I guess what I missed in the first place is the fact that
those DL tasks want to run 100%.

>> I see that e.g. TID[731] should run 1ms each 10ms w/o grub and with grub
>> the runtime could be potentially longer since 'scaled_delta_exec < delta'.
>>
> Yes correct. GRUB(Greedy Reclamation of Unused Bandwidth) algorithm
> is used here for deadline tasks that needs to run longer than their
> runtime when needed. sched_setattr allows a flag SCHED_FLAG_RECLAIM
> to indicate that the task would like to reclaim unused bandwidth of a
> cpu if available. For those tasks, 'runtime' is depreciated using the
> GRUB formula and it allows it to run for longer and reclaim the free
> bandwidth of the cpu. The GRUB implementation in linux allows a task
> to reclaim upto RT capacity(95%) and depends on the free bandwidth
> of the cpu. So TID[731] theoretically should run for 95ms as it is
> the only task in the cpu, but it doesn't get to run that long.

Correct.

>> I don't get this comment in update_curr_dl():
>>
>> 1325    /*
>> 1326     * For tasks that participate in GRUB, we implement GRUB-PA: the
>> 1327     * spare reclaimed bandwidth is used to clock down frequency.
>> 1328     *
>>
>> It looks like dl_se->runtime is affected and with 'scaled_delta_exec <
>> delta' the task runs longer than dl_se->dl_runtime?
>>
> Yes.
> As mentioned above, GRUB allows the task to run longer by slowing
> down the depreciation of "dl_se->dl_runtime". scaled_delta_exec is
> calculated by the GRUB formula explained in the paper [1] & [2].

What I didn't understand was this `GRUB-PA` and `the spare reclaimed
bandwidth is used to clock down frequency` in relation to GRUB task
runtime depreciation.

But now I think I get it. `GRUB-PA` means that in case we run with the
schedutil CPUfreq governor, the CPU frequency is influenced by Uact
(rq->dl.running_bw) via:

  sugov_get_util() -> effective_cpu_util() -> cpu_bw_dl()

    return (rq->dl.running_bw * SCHED_CAPACITY_SCALE) >> BW_SHIFT

and on top of this we do GRUB reclaiming for those SCHED_FLAG_RECLAIM
tasks, i.e. task runtime depreciation.

>> I did the test discussed later in this thread with:
>>
>> 3 [3/100] tasks (dl_se->dl_bw = (3 << 20)/100 = 31457) on 3 CPUs
>>
>> factor = scaled_delta_exec/delta
>>
>> - existing grub
>>
>> rq->dl.bw_ratio = ( 100 << 8 ) / 95 = 269
>> rq->dl.extra_bw = ( 95 << 20 ) / 100 = 996147
>>
>> cpu=2 curr->[thread0-2 1715] delta=2140100 this_bw=31457
>> running_bw=31457 extra_bw=894788 u_inact=0 u_act_min=33054 u_act=153788
>> scaled_delta_exec=313874 factor=0.14
>>
>> - your solution patch [1-2]
>>
>> cpu=2 curr->[thread0-0 1676] delta=157020 running_bw=31457 max_bw=996147
>> res=4958 factor=0.03
>>
>> You say that GRUB calculation is inaccurate and that this inaccuracy
>> gets larger as the bandwidth of tasks becomes smaller.
>>
>> Could you explain this inaccuracy on this example?
>>
> According to GRUB, we should be able to reclaim the unused bandwidth
> for the running task upto RT limits(95%). In this example we have a
> task with 3ms runtime and 100ms period on a cpu. So it is supposed
> to run for 95ms before it is throttled.

Correct.

> Existing implementation's factor = 0.14 and 3ms is depreciated by
> this factor. So it gets to run for "3 / 0.14 ~= 22ms". This is the
> inaccuracy that the patch is trying to solve.
> With the patch, the factor is .03166 and
> runtime = "3 / 0.03166 ~= 95ms"

My tests were wrong since I was using a DL task with dl_runtime=3ms and
dl_period=100ms whose actual runtime was also 3ms, whereas your tasks
probably want to run 100%.

> Hope this clarifies.

Yes, it did, thanks!
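
For reference, here is a minimal userspace sketch that reproduces the
fixed-point numbers from the traces above. It is not kernel code. The
function names ending in _sketch are made up for illustration, and
grub_reclaim_sketch() is only modeled on my reading of the pre-patch
grub_reclaim(), so treat its exact shape as an assumption. The
single-task shortcut (delta * running_bw / max_bw) is just an
observation that happens to match 'res' in the patched trace; it is not
meant to describe the patch's general formula.

```c
#include <assert.h>
#include <stdint.h>

/* Bandwidths are fractions scaled by 2^20; bw_ratio is scaled by 2^8. */
#define BW_SHIFT		20
#define BW_UNIT			(1ULL << BW_SHIFT)
#define RATIO_SHIFT		8
#define SCHED_CAPACITY_SCALE	1024ULL

/* runtime/period as a BW_SHIFT fixed-point fraction, e.g. 3/100 -> 31457. */
static uint64_t to_ratio_sketch(uint64_t period, uint64_t runtime)
{
	return (runtime << BW_SHIFT) / period;
}

/* Assumption: modeled on the pre-patch grub_reclaim(), not copied from it. */
static uint64_t grub_reclaim_sketch(uint64_t delta, uint64_t dl_bw,
				    uint64_t this_bw, uint64_t running_bw,
				    uint64_t extra_bw, uint64_t bw_ratio)
{
	uint64_t u_inact = this_bw - running_bw;	/* Utot - Uact */
	uint64_t u_act_min = (dl_bw * bw_ratio) >> RATIO_SHIFT;
	uint64_t u_act;

	if (u_inact + extra_bw > BW_UNIT - u_act_min)
		u_act = u_act_min;
	else
		u_act = BW_UNIT - u_inact - extra_bw;

	return (delta * u_act) >> BW_SHIFT;
}

/* Hypothetical helper: single-task case that matches 'res' in the trace. */
static uint64_t single_task_reclaim_sketch(uint64_t delta, uint64_t running_bw,
					   uint64_t max_bw)
{
	return delta * running_bw / max_bw;
}

/* Wall-clock time a budget lasts when it drains at scaled/delta per unit. */
static double reclaimed_runtime_ms(double runtime_ms, uint64_t scaled,
				   uint64_t delta)
{
	return runtime_ms * (double)delta / (double)scaled;
}

/* Same scaling as the cpu_bw_dl() expression quoted in this mail. */
static uint64_t cpu_bw_dl_sketch(uint64_t running_bw)
{
	return (running_bw * SCHED_CAPACITY_SCALE) >> BW_SHIFT;
}

/* Cross-check the helpers against the values printed in the traces. */
static void check_against_trace(void)
{
	assert(to_ratio_sketch(100, 3) == 31457);	/* dl_se->dl_bw */
	assert(to_ratio_sketch(100, 95) == 996147);	/* 95% RT limit */
	assert(grub_reclaim_sketch(2140100, 31457, 31457, 31457,
				   894788, 269) == 313874);
	assert(single_task_reclaim_sketch(157020, 31457, 996147) == 4958);
}
```

With the unrounded trace values, reclaimed_runtime_ms(3.0, 313874,
2140100) comes out at a bit over 20ms (the "~22ms" above uses the factor
rounded to 0.14), while reclaimed_runtime_ms(3.0, 4958, 157020) is
roughly 95ms, which is the gap the patch addresses.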