Re: [PATCH] sched: Don't try to catch up excess steal time.

Suleiman Souhlal <suleiman@xxxxxxxxxx> · Tue, 20 Aug 2024 10:46:11 +0900

On Tue, Aug 6, 2024 at 8:13 PM Suleiman Souhlal <suleiman@xxxxxxxxxx> wrote:
>
> When steal time exceeds the measured delta when updating clock_task, we
> currently try to catch up the excess in future updates.
> However, this results in inaccurate run times for the future clock_task
> measurements, as they end up getting additional steal time that did not
> actually happen, from the previous excess steal time being paid back.
>
> For example, suppose a task in a VM runs for 10ms and had 15ms of steal
> time reported while it ran. clock_task rightly doesn't advance. Then, a
> different task runs on the same rq for 10ms without any time stolen.
> Because of the current catch up mechanism, clock_sched inaccurately ends
> up advancing by only 5ms instead of 10ms even though there wasn't any
> actual time stolen. The second task is getting charged for less time
> than it ran, even though it didn't deserve it.
> In other words, tasks can end up getting more run time than they should
> actually get.
>
> So, we instead don't make future updates pay back past excess stolen time.
>
> Signed-off-by: Suleiman Souhlal <suleiman@xxxxxxxxxx>

Gentle ping.

-- Suleiman