Hi Luca,

On Mon, May 15, 2023 at 4:06 AM luca abeni <luca.abeni@xxxxxxxxxxxxxxx> wrote:
>
> this patch is giving me some headaches:
>
Sorry about that. I was also stressing out over how to get the
reclaiming done right for the past couple of days ;-)

> Vineeth Pillai <vineeth@xxxxxxxxxxxxxxx> wrote:
> [...]
> >   * Uextra:  Extra bandwidth not reserved:
> > - *            = Umax - \Sum(u_i / #cpus in the root domain)
> > + *            = Umax - this_bw
>
> While I agree that this setting should be OK, it ends up with
>         dq = -Uact / Umax * dt
> which I remember I originally tried, and gave some issues
> (I do not remember the details, but I think if you try N
> identical reclaiming tasks, with N > M, the reclaimed time
> is not distributed equally among them?)
>
I have noticed this behaviour, where the reclaimed time is not equally
distributed when we have more tasks than available processors. But it
depended on where the task was scheduled. Within the same cpu, the
distribution seemed to be proportional. But the tasks migrated often,
and then, depending on whether the task got a whole cpu for its runtime
or not, the reclaimed bandwidth differed. I thought that should be okay
as it depended upon where the task landed.

One other problem I saw was cpu usage spiking above max_bw, sometimes
leading to a system hang. I thought that stopping reclaiming when
running_bw gets larger than max_bw (in the 4th patch) fixed this, but
when I ran the tests long enough, I still saw the hang.

> I need to think a little bit more about this...
>
Thanks for looking into this. I have a basic idea of why tasks with
less bandwidth reclaim less in SMP when the number of tasks is less
than the number of cpus, but I do not yet have a verifiable fix for it.

If patches 1 and 4 look good to you, we can drop 2 and 3 and fix the
SMP issue with varying bandwidth separately.
Patch 4 would differ a bit once I remove 2 and 3, so as to use the
formula:

    dq = -(max{u, (Umax_reclaim - Uinact - Uextra)} / Umax_reclaim) dt

Thanks for your patience with all this brainstorming :-)

Thanks,
Vineeth
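
P.S. A rough illustration of the formula above, in case it helps the
discussion. This is only a floating-point sketch in Python, not the
kernel code (which uses fixed-point arithmetic). The function name
grub_dq and all the sample numbers are made up for the example. It
assumes Uinact = this_bw - running_bw, takes Uextra = Umax - this_bw as
in the quoted hunk, and treats Umax_reclaim and Umax as the same value.
Under those assumptions the max{} term collapses to running_bw (Uact),
which gives the dq = -Uact / Umax * dt form mentioned in the quoted
text.

```python
import math


def grub_dq(dt, u, u_inact, u_extra, u_max_reclaim):
    """Runtime change over an interval dt for one reclaiming task.

    dq = -(max{u, (Umax_reclaim - Uinact - Uextra)} / Umax_reclaim) dt
    All bandwidths are plain fractions of one CPU (floats).
    """
    rate = max(u, u_max_reclaim - u_inact - u_extra) / u_max_reclaim
    return -rate * dt


# Hypothetical per-runqueue numbers, chosen only for illustration.
u_max = 0.95        # maximum reclaimable bandwidth
this_bw = 0.50      # total bandwidth of the tasks on this runqueue
running_bw = 0.30   # bandwidth of the currently active tasks (Uact)
u = 0.10            # bandwidth of the task being charged

u_inact = this_bw - running_bw   # assumption: Uinact = this_bw - running_bw
u_extra = u_max - this_bw        # the "+" line of the quoted hunk

dq = grub_dq(1.0, u, u_inact, u_extra, u_max)

# With this Uextra, Umax - Uinact - Uextra equals running_bw, so the
# rule becomes dq = -Uact / Umax * dt whenever Uact >= u.
assert math.isclose(dq, -running_bw / u_max)
```

In this simplified model the depletion rate depends only on
per-runqueue quantities whenever Uact >= u, so tasks sharing a
runqueue are charged at the same rate, and a task's reclaimed time
changes when it lands on a runqueue with a different running_bw.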