On Mon, Sep 17, 2018 at 11:26:48AM +0200, Juri Lelli wrote:
> Hi,
>
> On 14/09/18 23:13, Patel, Vedang wrote:
> > Hi all,
> >
> > We have been playing around with SCHED_DEADLINE and found some
> > discrepancy around the calculation of nr_involuntary_switches and
> > nr_voluntary_switches in /proc/${PID}/sched.
> >
> > Whenever the task is done with its work early and executes
> > sched_yield() to voluntarily give up the CPU, this increments
> > nr_involuntary_switches. It should have incremented
> > nr_voluntary_switches.
>
> Mmm, I see what you are saying.
>
> [...]
>
> > Looking at __schedule() in kernel/sched/core.c, the switch is counted
> > as part of nr_involuntary_switches if the task has not been preempted
> > and the task is in TASK_RUNNING state. This does not seem to happen
> > when sched_yield() is called.
>
> Mmm,
>
>  - nr_voluntary_switches++ if !preempt && !RUNNING
>  - nr_involuntary_switches++ otherwise (yield fits this, as the task is
>    still RUNNING, even though throttled for DEADLINE)
>
> Not sure this is the same as what you say above..
>
> > Is there something we are missing over here? Or is this a known issue
> > that is planned to be fixed later?
>
> .. however, not sure. Peter, what do you say? It looks like we might
> indeed want to account yield as a voluntary switch; it seems to fit. In
> that case I guess we could use a flag or add a sched_ bit to task_struct
> to handle the case?

It's been like this _forever_, AFAICT.

This isn't deadline specific, AFAICT; all yield callers will end up in
non-voluntary switches.

I don't know anybody that cares, and I don't think this is something
worth fixing. If someone did rely on this behaviour we'd break them, and
I'd much rather save a cycle than add more stupid stats crap to the
scheduler.