On Tue, 4 Feb 2025 08:16:53 -0500
Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:

> On Tue, 4 Feb 2025 07:51:00 -0500
> Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> > > I'm so confused, WTF do you then need the lazy crap?
>
> IOW, the "lazy crap" was created to solve this very issue. The holding of
> sleeping spin locks interrupted by a scheduler tick. I'm just giving user
> space the same feature that we gave the kernel in PREEMPT_RT.
>

Also, I believe it is best to follow the current preemption method, and
that's what NEED_RESCHED_LAZY gives us.

Let's say you have a low priority program, maybe even malicious, that goes
into a loop of calling a system call that can run for almost a millisecond
without sleeping.

In PREEMPT_NONE, this low priority program can cause RT tasks a latency of
a millisecond, because if an RT task wakes up just as the program enters
the system call, it will have to wait for the program to exit that system
call before it can run, which might take close to that millisecond.

For PREEMPT_VOLUNTARY, the program will only hold off other tasks until it
hits a might_sleep() or cond_resched() (PREEMPT_NONE would also reschedule
on the cond_resched(), but we want to get rid of those).

For PREEMPT_FULL, the program shouldn't affect any other task because its
system call will simply be preempted.

Now let's look at this new feature. It allows a task to ask for some
extended time to get out of a critical section if possible.

If we decide in the future to replace PREEMPT_NONE and PREEMPT_VOLUNTARY
with a dynamic type like:

 TYPE      | Sched Tick     | RT Wakeup   | Enter user space   |
===========+================+=============+====================+
 None      | Set LAZY       | Set LAZY    | schedule           |
-----------+----------------+-------------+--------------------+
 Voluntary?| Set LAZY       | schedule    | schedule           |
-----------+----------------+-------------+--------------------+
 Full      | schedule       | schedule    | schedule           |
-----------+----------------+-------------+--------------------+

 (The "Enter user space" is when a NEED_RESCHED is set)

Where in NONE, the LAZY flag is set for both the sched tick and the RT
wakeup, and it doesn't schedule until it hits user space.

In "Voluntary", the LAZY flag is set only for the sched tick on SCHED_OTHER
tasks, but RT tasks will get to be scheduled immediately (depending on
preempt_disable, of course).

With "Full" it will schedule whenever it can.

For the task that calls the long system call, whichever method type above
is in place determines the latency of other tasks.

Now, if we add this feature, I want it to behave the same as a long system
call: it would only extend the time if a long system call would extend the
time. That means it wouldn't modify the typical behavior of the system for
other tasks, but it would help the performance of the task that is
requesting this feature.

With this feature:

 TYPE      | Sched Tick     | RT Wakeup   | Enter user space      |
===========+================+=============+=======================+
 None      | Set LAZY       | Set LAZY    | schedule if !LAZY     |
-----------+----------------+-------------+-----------------------+
 Voluntary?| Set LAZY       | schedule    | schedule if !LAZY     |
-----------+----------------+-------------+-----------------------+
 Full      | schedule       | schedule    | schedule              |
-----------+----------------+-------------+-----------------------+

Thus, in NONE, it would likely get to extend its time just as if it had
called a long system call. This can even include making RT tasks wait a
little longer, just like they would wait on a system call.

In "Voluntary", it would only get its timeslice extended if it was another
SCHED_OTHER task that is to be scheduled. But if an RT task woke up, it
would schedule immediately regardless of whether an extended time slice
was requested or not.

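As a rough illustration only, the "With this feature" table above could be
modeled in plain C along the following lines. This is a user-space sketch,
not kernel code: every name in it (preempt_model, resched_event,
resched_flag, flag_for_event(), schedule_on_user_entry()) is made up for
this example, and it assumes "schedule if !LAZY" means the task is left
running only when nothing stronger than LAZY was set and it asked for an
extension.

```c
#include <stdbool.h>

/* All names below are hypothetical and exist only for this sketch. */
enum preempt_model { MODEL_NONE, MODEL_VOLUNTARY, MODEL_FULL };
enum resched_event { EVENT_SCHED_TICK, EVENT_RT_WAKEUP };
enum resched_flag  { FLAG_NONE, FLAG_LAZY, FLAG_NOW };

/* "Sched Tick" and "RT Wakeup" columns: which flag the event sets. */
static enum resched_flag flag_for_event(enum preempt_model model,
					enum resched_event event)
{
	switch (model) {
	case MODEL_NONE:
		/* Both the tick and an RT wakeup only set LAZY. */
		return FLAG_LAZY;
	case MODEL_VOLUNTARY:
		/* The tick sets LAZY; an RT wakeup schedules right away. */
		return event == EVENT_RT_WAKEUP ? FLAG_NOW : FLAG_LAZY;
	case MODEL_FULL:
	default:
		/* Always schedule, so no extension is ever granted. */
		return FLAG_NOW;
	}
}

/* "Enter user space" column: schedule unless only LAZY is pending
 * and the task asked for an extended time slice. */
static bool schedule_on_user_entry(enum resched_flag flag,
				   bool wants_extension)
{
	if (flag == FLAG_NONE)
		return false;
	if (flag == FLAG_LAZY && wants_extension)
		return false;
	return true;
}
```

With this model, a task that requested an extension keeps running on
return to user space only when the pending flag is LAZY, which is exactly
the set of cases where a long system call would also have delayed the
other task.
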
In "Full", it probably makes sense to simply disable this feature (the
program would see that it is disabled when it registers the rseq), as it
would never get its time slice extended: a system call would be preempted
immediately if it was interrupted.

So, back to your question about why I'm tying this to the "lazy crap": it
is because I want the behavior of other tasks to not change due to one
task asking for an extended time slice.

-- Steve