On Sat, 13 Jul 2024 at 16:05, Carlos Bilbao <carlos.bilbao.osdev@xxxxxxxxx> wrote:
>
> Add some documentation regarding the newly introduced scheduler EEVDF.
>
> Reviewed-by: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>
> Tested-by: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>
> Signed-off-by: Carlos Bilbao <carlos.bilbao.osdev@xxxxxxxxx>
> ---
>  Documentation/scheduler/index.rst            |  1 +
>  Documentation/scheduler/sched-design-CFS.rst | 10 +++--
>  Documentation/scheduler/sched-eevdf.rst      | 44 ++++++++++++++++++++
>  3 files changed, 51 insertions(+), 4 deletions(-)
>  create mode 100644 Documentation/scheduler/sched-eevdf.rst
>
> diff --git a/Documentation/scheduler/index.rst b/Documentation/scheduler/index.rst
> index 43bd8a145b7a..1f2942c4d14b 100644
> --- a/Documentation/scheduler/index.rst
> +++ b/Documentation/scheduler/index.rst
> @@ -12,6 +12,7 @@ Scheduler
>      sched-bwc
>      sched-deadline
>      sched-design-CFS
> +    sched-eevdf
>      sched-domains
>      sched-capacity
>      sched-energy
> diff --git a/Documentation/scheduler/sched-design-CFS.rst b/Documentation/scheduler/sched-design-CFS.rst
> index bc1e507269c6..b703c6dcb3cd 100644
> --- a/Documentation/scheduler/sched-design-CFS.rst
> +++ b/Documentation/scheduler/sched-design-CFS.rst
> @@ -8,10 +8,12 @@ CFS Scheduler
>  1. OVERVIEW
>  ============
>
> -CFS stands for "Completely Fair Scheduler," and is the new "desktop" process
> -scheduler implemented by Ingo Molnar and merged in Linux 2.6.23. It is the
> -replacement for the previous vanilla scheduler's SCHED_OTHER interactivity
> -code.
> +CFS stands for "Completely Fair Scheduler," and is the "desktop" process
> +scheduler implemented by Ingo Molnar and merged in Linux 2.6.23. When
> +originally merged, it was the replacement for the previous vanilla
> +scheduler's SCHED_OTHER interactivity code. Nowadays, CFS is making room
> +for EEVDF, for which documentation can be found in
> +:ref:`sched_design_EEVDF`.
>
>  80% of CFS's design can be summed up in a single sentence: CFS basically models
>  an "ideal, precise multi-tasking CPU" on real hardware.
> diff --git a/Documentation/scheduler/sched-eevdf.rst b/Documentation/scheduler/sched-eevdf.rst
> new file mode 100644
> index 000000000000..019327da333a
> --- /dev/null
> +++ b/Documentation/scheduler/sched-eevdf.rst
> @@ -0,0 +1,44 @@
> +.. _sched_design_EEVDF:
> +
> +===============
> +EEVDF Scheduler
> +===============
> +
> +The "Earliest Eligible Virtual Deadline First" (EEVDF) was first introduced
> +in a scientific publication in 1995 [1]. The Linux kernel began
> +transitioning to EEVDF in version 6.6 (as a new option in 2024), moving
> +away from the earlier Completely Fair Scheduler (CFS) in favor of a version
> +of EEVDF proposed by Peter Zijlstra in 2023 [2-4]. More information
> +regarding CFS can be found in :ref:`sched_design_CFS`.
> +
> +Similarly to CFS, EEVDF aims to distribute CPU time equally among all
> +runnable tasks with the same priority. To do so, it assigns a virtual run
> +time to each task, creating a "lag" value that can be used to determine
> +whether a task has received its fair share of CPU time. In this way, a task
> +with a positive lag is owed CPU time, while a negative lag means the task
> +has exceeded its portion. EEVDF picks tasks with lag greater or equal to
> +zero and calculates a virtual deadline (VD) for each, selecting the task
> +with the earliest VD to execute next. It's important to note that this
> +allows latency-sensitive tasks with shorter time slices to be prioritized,
> +which helps with their responsiveness.
> +
> +There are ongoing discussions on how to manage lag, especially for sleeping
> +tasks; but at the time of writing EEVDF uses a "decaying" mechanism based
> +on virtual run time (VRT). This prevents tasks from exploiting the system
> +by sleeping briefly to reset their negative lag: when a task sleeps, it
> +remains on the run queue but marked for "deferred dequeue," allowing its
> +lag to decay over VRT. Hence, long-sleeping tasks eventually have their lag
> +reset. Finally, tasks can preempt others if their VD is earlier, and tasks
> +can request specific time slices using the new sched_setattr() system call,
> +which further facilitates the job of latency-sensitive applications.
> +
> +REFERENCES
> +==========
> +
> +[1] https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=805acf7726282721504c8f00575d91ebfd750564
> +
> +[2] https://lore.kernel.org/lkml/a79014e6-ea83-b316-1e12-2ae056bda6fa@xxxxxxxxxxxxxxxxxx/
> +
> +[3] https://lwn.net/Articles/969062/
> +
> +[4] https://lwn.net/Articles/925371/
> --
> 2.43.0
>

Reviewed-by: Sergio González Collado <sergio.collado@xxxxxxxxx>
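
For anyone following the thread, the selection rule described in the second
paragraph of the new file (keep only tasks with lag >= 0, then run the one
with the earliest virtual deadline) can be sketched roughly as below. This is
an illustration in Python, not kernel code. The Task fields, the use of a
weighted average as the reference virtual time, and the deadline formula
vruntime + slice / weight are all assumptions made for the example; the
kernel's actual bookkeeping differs.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Task:
    """Hypothetical task record; field names are illustrative only."""
    name: str
    vruntime: float  # virtual run time consumed so far
    weight: float    # relative CPU share (same value = same priority)
    slice: float     # requested time slice


def avg_vruntime(tasks: Sequence[Task]) -> float:
    """Weighted average virtual run time, used here as the fairness reference."""
    total_weight = sum(t.weight for t in tasks)
    return sum(t.weight * t.vruntime for t in tasks) / total_weight


def lag(task: Task, tasks: Sequence[Task]) -> float:
    """Positive lag: the task is owed CPU time. Negative: it ran past its share."""
    return task.weight * (avg_vruntime(tasks) - task.vruntime)


def virtual_deadline(task: Task) -> float:
    """Assumed deadline: current virtual run time plus the weighted slice."""
    return task.vruntime + task.slice / task.weight


def pick_next(tasks: Sequence[Task]) -> Optional[Task]:
    """Pick the eligible task (lag >= 0) with the earliest virtual deadline."""
    if not tasks:
        return None
    eligible = [t for t in tasks if lag(t, tasks) >= 0]
    # Fall back to all tasks if floating-point rounding leaves nobody eligible.
    return min(eligible or tasks, key=virtual_deadline)


if __name__ == "__main__":
    run_queue = [
        Task("A", vruntime=10.0, weight=1.0, slice=3.0),
        Task("B", vruntime=8.0, weight=1.0, slice=3.0),
        Task("C", vruntime=9.0, weight=1.0, slice=0.5),
        Task("D", vruntime=12.0, weight=1.0, slice=0.5),
    ]
    for t in run_queue:
        print(t.name, "lag:", lag(t, run_queue), "VD:", virtual_deadline(t))
    print("next:", pick_next(run_queue).name)
```

In this made-up run queue, B has the most positive lag, but C is chosen
because its shorter slice gives it an earlier virtual deadline. D also asks
for a short slice, yet it is skipped because its lag is negative.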