Hello,

On 7/12/24 23:22, Randy Dunlap wrote:
> Hi,
>
> On 7/12/24 5:32 PM, Carlos Bilbao wrote:
>> Add some documentation regarding the newly introduced scheduler EEVDF.
>>
>> Signed-off-by: Carlos Bilbao <carlos.bilbao.osdev@xxxxxxxxx>
>> ---
>> Documentation/scheduler/index.rst | 1 +
>> Documentation/scheduler/sched-design-CFS.rst | 10 +++--
>> Documentation/scheduler/sched-eevdf.rst | 44 ++++++++++++++++++++
>> 3 files changed, 51 insertions(+), 4 deletions(-)
>> create mode 100644 Documentation/scheduler/sched-eevdf.rst
>>
>> diff --git a/Documentation/scheduler/index.rst b/Documentation/scheduler/index.rst
>> index 43bd8a145b7a..444a6fef1464 100644
>> --- a/Documentation/scheduler/index.rst
>> +++ b/Documentation/scheduler/index.rst
>> @@ -11,6 +11,7 @@ Scheduler
>> sched-arch
>> sched-bwc
>> sched-deadline
>> + sched-eevdf
>
> I would have probably put EEVDF just after CFS instead of before it...
> whatever.
>
>> sched-design-CFS
>> sched-domains
>> sched-capacity
>> diff --git a/Documentation/scheduler/sched-design-CFS.rst b/Documentation/scheduler/sched-design-CFS.rst
>> index bc1e507269c6..b703c6dcb3cd 100644
>> --- a/Documentation/scheduler/sched-design-CFS.rst
>> +++ b/Documentation/scheduler/sched-design-CFS.rst
>> @@ -8,10 +8,12 @@ CFS Scheduler
>> 1. OVERVIEW
>> ============
>>
>> -CFS stands for "Completely Fair Scheduler," and is the new "desktop" process
>> -scheduler implemented by Ingo Molnar and merged in Linux 2.6.23. It is the
>> -replacement for the previous vanilla scheduler's SCHED_OTHER interactivity
>> -code.
>> +CFS stands for "Completely Fair Scheduler," and is the "desktop" process
>> +scheduler implemented by Ingo Molnar and merged in Linux 2.6.23. When
>> +originally merged, it was the replacement for the previous vanilla
>> +scheduler's SCHED_OTHER interactivity code. Nowadays, CFS is making room
>> +for EEVDF, for which documentation can be found in
>> +:ref:`sched_design_EEVDF`.
>>
>> 80% of CFS's design can be summed up in a single sentence: CFS basically models
>> an "ideal, precise multi-tasking CPU" on real hardware.
>> diff --git a/Documentation/scheduler/sched-eevdf.rst b/Documentation/scheduler/sched-eevdf.rst
>> new file mode 100644
>> index 000000000000..31ad8f995360
>> --- /dev/null
>> +++ b/Documentation/scheduler/sched-eevdf.rst
>> @@ -0,0 +1,44 @@
>> +.. _sched_design_EEVDF:
>> +
>> +===============
>> +EEVDF Scheduler
>> +===============
>> +
>> +The "Earliest Eligible Virtual Deadline First" (EEVDF) was first introduced
>> +in a scientific publication in 1995 [1]. The Linux kernel began
>> +transitioning to EEVDF in version 6.6 (as a new option in 2024), moving
>> +away from the earlier Completely Fair Scheduler (CFS) in favor of a version
>> +of EEVDF proposed by Peter Zijlstra in 2023 [2-4]. More information
>> +regarding CFS can be found in :ref:`sched_design_CFS`.
>> +
>> +Similarly to CFS, EEVDF aims to distribute CPU time equally among all
>> +runnable tasks with the same priority. To do so, it assigns a virtual run
>> +time to each task, creating a "lag" value that can be used to determine
>> +whether a task has received its fair share of CPU time. In this way, a task
>> +with a positive lag is owed CPU time, while a negative lag means the task
>> +has exceeded its portion. EEVDF picks tasks with lag greater or equal to
>> +zero and calculates a virtual deadline (VD) for each, selecting the task
>> +with the earliest VD to execute next. It's important to note that this
>> +allows latency-sensitive tasks with shorter time slices to be prioritized,
>> +which helps with their responsiveness.
>> +
>> +There are ongoing discussions on how to manage lag, especially for sleeping
>> +tasks; but at the time of writing EEVDF uses a "decaying" mechanism based
>> +on virtual run time (VRT). This prevents tasks from exploiting the system
>> +by sleeping briefly to reset their negative lag: when a task sleeps, it
>> +remains on the run queue but marked for "deferred dequeue," allowing its
>> +lag to decay over VRT. Hence, long-sleeping tasks eventually have their lag
>> +reset. Finally, tasks can preempt others if their VD is earlier, and tasks
>> +can request specific time slices using the new sched_setattr() system call,
>> +which further facilitates the job of latency-sensitive applications.
>> +
>> +4. REFERENCES
>> +=============
>
> Why is this section numbered 4?
> No other sections here are numbered.
>
>> +
>> +[1] https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=805acf7726282721504c8f00575d91ebfd750564
>> +
>> +[2] https://lore.kernel.org/lkml/a79014e6-ea83-b316-1e12-2ae056bda6fa@xxxxxxxxxxxxxxxxxx/
>> +
>> +[3] https://lwn.net/Articles/969062/
>> +
>> +[4] https://lwn.net/Articles/925371/
>
> Other than those 2 comments:
>
> Reviewed-by: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>
> Tested-by: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>

Thank you for reviewing and providing feedback, Randy. I'm sending v2 now.

>
>
> Thanks.
>

Thanks,
Carlos
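
For anyone following the thread, the pick rule described in the quoted sched-eevdf.rst text (lag, eligibility, earliest virtual deadline) can be sketched as a small toy model. This is only an illustration under stated assumptions, not kernel code: the Task class, its field names, the helper functions, and the example numbers are all hypothetical, and the model ignores sleeping tasks, lag decay, and preemption.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # All fields are hypothetical names chosen for this sketch.
    name: str
    weight: int        # relative CPU share (equal weights = same priority)
    vruntime: float    # virtual run time received so far
    time_slice: float  # requested slice length


def lag(task, tasks):
    # Lag is the task's weight times the gap between the weighted average
    # virtual run time and its own. Summing differences avoids rounding
    # surprises when all virtual run times are equal.
    total_weight = sum(t.weight for t in tasks)
    gap = sum(t.weight * (t.vruntime - task.vruntime) for t in tasks)
    return task.weight * gap / total_weight


def virtual_deadline(task):
    # A shorter slice or a larger weight gives an earlier virtual deadline.
    return task.vruntime + task.time_slice / task.weight


def pick_next(tasks):
    # Only tasks with lag >= 0 (owed CPU time, or exactly even) are eligible.
    eligible = [t for t in tasks if lag(t, tasks) >= 0]
    # Among eligible tasks, run the one with the earliest virtual deadline.
    return min(eligible, key=virtual_deadline)


tasks = [
    Task("A", weight=1, vruntime=10.0, time_slice=3.0),
    Task("B", weight=1, vruntime=8.0, time_slice=6.0),
    Task("C", weight=1, vruntime=12.0, time_slice=0.5),
]

chosen = pick_next(tasks)
print(chosen.name)
```

In this made-up example, C has the earliest virtual deadline overall but has negative lag, so it is not eligible. Between A and B, A is picked because its shorter slice gives it the earlier deadline, even though B is owed more CPU time.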