On Tue, 11 Feb 2025 09:21:38 +0100
Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> wrote:

> We don't follow this behaviour exactly today.
>
> Adding this behaviour back vs the behaviour we have now, doesn't seem to
> improve anything at visible levels. We don't have a counter but we can
> look at the RCU nesting counter which should be zero once locks have
> been dropped. So this can be used for testing.
>
> But as I said: using "run to completion" and preempt on the return
> userland rather than once the lazy flag is seen and all locks have been
> released appears to be better.
>
> It is (now) possible that you run for a long time and get preempted
> while holding a spinlock_t. It is however more likely that you release
> all locks and get preempted while returning to userland.

IIUC, today, LAZY causes all SCHED_OTHER tasks to act more like
PREEMPT_NONE. Is that correct?

Now that the PREEMPT_RT is not one of the preemption selections, when you
select PREEMPT_RT, you can pick between CONFIG_PREEMPT and
CONFIG_PREEMPT_LAZY. Where CONFIG_PREEMPT will preempt the kernel at the
scheduler tick if preemption is enabled and CONFIG_PREEMPT_LAZY will
not preempt the kernel on a scheduler tick and wait for exit to user space.

Sebastian,

It appears you only tested the CONFIG_PREEMPT_LAZY selection. Have you
tested the difference of how CONFIG_PREEMPT behaves between PREEMPT_RT and
no PREEMPT_RT? I think that will show a difference like we had in the past.

I can see people picking both PREEMPT_RT and CONFIG_PREEMPT (Full), but
then wondering why their non RT tasks are suffering from a performance
penalty from that.

-- Steve
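
For illustration only, the difference between the two selections as
described above can be sketched as a toy model in plain C. This is not
kernel code: every name here (preempt_model, syscall_timeline,
preempted_at) is hypothetical, time is an abstract integer, and the model
only encodes "preempt at the tick if preemption is enabled" versus "wait
for exit to user space" for a single SCHED_OTHER task. It leaves out
everything else the scheduler does.

```c
/* Toy model only; these names are hypothetical, not kernel identifiers. */
enum preempt_model { MODEL_FULL, MODEL_LAZY };

struct syscall_timeline {
    int tick_at;            /* time the scheduler tick asks for a reschedule */
    int preempt_enabled_at; /* time in-kernel preemption becomes possible */
    int exit_to_user_at;    /* time the task returns to user space */
};

/* Return the time at which the SCHED_OTHER task is preempted. */
static int preempted_at(enum preempt_model m, struct syscall_timeline t)
{
    /* Tick arrives once the task is back in user space: same in both models. */
    if (t.tick_at >= t.exit_to_user_at)
        return t.tick_at;

    /* LAZY (as described above): no in-kernel preemption, wait for the exit. */
    if (m == MODEL_LAZY)
        return t.exit_to_user_at;

    /* FULL: preempt at the tick, or as soon as preemption is enabled again. */
    int when = t.tick_at > t.preempt_enabled_at ? t.tick_at
                                                : t.preempt_enabled_at;

    /* The exit to user space is a preemption point in any case. */
    return when < t.exit_to_user_at ? when : t.exit_to_user_at;
}
```

With a tick at time 5, preemption enabled from time 2 and the exit to user
space at time 20, this model preempts at 5 for MODEL_FULL and at 20 for
MODEL_LAZY. That gap is the time the task would get to "run to completion"
in the kernel under the lazy selection.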