Re: [tip: sched/core] sched/eevdf: More PELT vs DELAYED_DEQUEUE

Hi,

On 11/27/24 04:17, K Prateek Nayak wrote:
> Hello Peter,
> 
> On 9/10/2024 1:39 PM, tip-bot2 for Peter Zijlstra wrote:
>> The following commit has been merged into the sched/core branch of tip:
>>
>> Commit-ID:     2e05f6c71d36f8ae1410a1cf3f12848cc17916e9
>> Gitweb:        https://git.kernel.org/tip/2e05f6c71d36f8ae1410a1cf3f12848cc17916e9
>> Author:        Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> AuthorDate:    Fri, 06 Sep 2024 12:45:25 +02:00
>> Committer:     Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> CommitterDate: Tue, 10 Sep 2024 09:51:15 +02:00
>>
>> sched/eevdf: More PELT vs DELAYED_DEQUEUE
>>
>> Vincent and Dietmar noted that while commit fc1892becd56 fixes the
>> entity runnable stats, it does not adjust the cfs_rq runnable stats,
>> which are based off of h_nr_running.
>>
>> Track h_nr_delayed such that we can discount those and adjust the
>> signal.
>>
>> Fixes: fc1892becd56 ("sched/eevdf: Fixup PELT vs DELAYED_DEQUEUE")
>> Reported-by: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
>> Reported-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
>> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
>> Link: https://lkml.kernel.org/r/20240906104525.GG4928@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> 
> I've been testing this fix for a while now to see if it helps with the
> regressions reported around EEVDF complete. The issue with negative
> "h_nr_delayed" reported by Luis previously seems to have been fixed as a
> result of commit 75b6499024a6 ("sched/fair: Properly deactivate
> sched_delayed task upon class change").

I recall having 75b6499024a6 in my testing tree and still noticing
unbalanced accounting for h_nr_delayed: it would eventually be decremented
twice, leading to negative numbers.
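
As a rough illustration of that kind of imbalance, here is a userspace toy
model. It is not the kernel code and does not claim to show where the real
double decrement happens; every toy_* name is hypothetical, and only the
h_nr_delayed and sched_delayed field names are borrowed from the thread. It
contrasts a clear path that decrements unconditionally with one that checks
whether the entity is still counted as delayed.

```c
#include <assert.h>   /* for callers that want to assert on toy_run() */
#include <stdbool.h>

/* Hypothetical userspace stand-ins, not the kernel's structures. */
struct toy_entity {
	bool sched_delayed;	/* entity is currently counted as delayed */
};

struct toy_cfs_rq {
	int h_nr_delayed;	/* signed, so an underflow shows up as negative */
};

/* Mark the entity delayed and count it exactly once. */
static void toy_set_delayed(struct toy_cfs_rq *cfs_rq, struct toy_entity *se)
{
	if (se->sched_delayed)
		return;
	se->sched_delayed = true;
	cfs_rq->h_nr_delayed++;
}

/* Unbalanced variant: decrements no matter what state the entity is in. */
static void toy_clear_delayed_unguarded(struct toy_cfs_rq *cfs_rq,
					struct toy_entity *se)
{
	se->sched_delayed = false;
	cfs_rq->h_nr_delayed--;
}

/* Balanced variant: decrements only if the entity was still counted. */
static void toy_clear_delayed_guarded(struct toy_cfs_rq *cfs_rq,
				      struct toy_entity *se)
{
	if (!se->sched_delayed)
		return;
	se->sched_delayed = false;
	cfs_rq->h_nr_delayed--;
}

/* One set followed by two clears; returns the final counter value. */
static int toy_run(bool guarded)
{
	struct toy_cfs_rq cfs_rq = { .h_nr_delayed = 0 };
	struct toy_entity se = { .sched_delayed = false };

	toy_set_delayed(&cfs_rq, &se);
	for (int i = 0; i < 2; i++) {
		if (guarded)
			toy_clear_delayed_guarded(&cfs_rq, &se);
		else
			toy_clear_delayed_unguarded(&cfs_rq, &se);
	}
	return cfs_rq.h_nr_delayed;
}
```

In this model toy_run(false) ends at -1 because the second clear is not
matched by a set, while toy_run(true) ends at 0.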

I might have to give it another go if we're considering including the change
as-is, just to make sure. Since our setups are slightly different, we could
be exercising some slightly different paths.

Did this patch help with the regressions you noticed, though?

> 
> I've been running stress-ng for a while and haven't seen any cases of
> negative "h_nr_delayed". I'd also added the following WARN_ON() to see
> if there are any delayed tasks on the cfs_rq before switching to idle in
> some of my previous experiments and I did not see any splat during my
> benchmark runs.
> 
> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> index 621696269584..c19a31fa46c9 100644
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -457,6 +457,9 @@ static void put_prev_task_idle(struct rq *rq, struct task_struct *prev, struct t
>  
>  static void set_next_task_idle(struct rq *rq, struct task_struct *next, bool first)
>  {
> +    /* All delayed tasks must be picked off before switching to idle */
> +    SCHED_WARN_ON(rq->cfs.h_nr_delayed);
> +
>      update_idle_core(rq);
>      scx_update_idle(rq, true);
>      schedstat_inc(rq->sched_goidle);
> -- 
> 
> If you are including this back, feel free to add:
> 
> Tested-by: K Prateek Nayak <kprateek.nayak@xxxxxxx>
> 
>> [..snip..]
> 




