On Tue, 4 Feb 2025 12:28:41 +0900
Suleiman Souhlal <suleiman@xxxxxxxxxx> wrote:

> Can you explain why this approach requires PREEMPT_LAZY?
>
> Could exit_to_user_mode_loop() be changed to something like the
> following (with maybe some provision to only do it once)?
>
>     if ((ti_work & _TIF_NEED_RESCHED) && !rseq_delay_resched())
>             schedule();

The main reason is that we need to differentiate between a preemption
based on a SCHED_OTHER scheduling tick and an RT task waking up. We
should not delay any RT tasks ever.

If PREEMPT_LAZY becomes the default, IIUC then even the old "server"
version will have RT tasks preempt tasks within the kernel without
waiting for another tick.

Currently, the only way to differentiate between a SCHED_OTHER scheduler
tick preemption and an RT task waking up is with NEED_RESCHED_LAZY vs
NEED_RESCHED.

Now, if we wanted to (and I'm not sure we do), we could add another way
to differentiate the two and still allow this to work.

>
> I suppose there would also need to be some additional changes to make
> sure full preemption also doesn't preempt, maybe in
> preempt_schedule*().

Which may be quite difficult as the cr_counter is in user space and can
only be read from a user space faultable context.

-- Steve
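
To make the NEED_RESCHED_LAZY vs NEED_RESCHED distinction above concrete, here is a minimal user-space sketch of the decision, not kernel code. The SKETCH_* bit values, the function name, and the delay_requested parameter are all made up for illustration; delay_requested only stands in for whatever the task asked for through its user-space counter.

```c
#include <stdbool.h>

/* Hypothetical flag bits, for illustration only; not the kernel's values. */
#define SKETCH_NEED_RESCHED       (1UL << 0)  /* e.g. an RT task woke up */
#define SKETCH_NEED_RESCHED_LAZY  (1UL << 1)  /* e.g. SCHED_OTHER tick */

/*
 * Returns true when the task must be scheduled out right away.
 * delay_requested is an assumed stand-in for the user-space request
 * to postpone preemption.
 */
bool sketch_must_schedule_now(unsigned long ti_work, bool delay_requested)
{
    /* A full NEED_RESCHED is never delayed. */
    if (ti_work & SKETCH_NEED_RESCHED)
        return true;

    /* A lazy request is the only kind that may be postponed. */
    if (ti_work & SKETCH_NEED_RESCHED_LAZY)
        return !delay_requested;

    /* Nothing pending. */
    return false;
}
```

With only a single flag there would be no way to take the second branch without also delaying the RT wakeup, which is why the approach leans on PREEMPT_LAZY.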