On 2025-02-11 10:28:01 [-0500], Steven Rostedt wrote:
> On Tue, 11 Feb 2025 09:21:38 +0100
> Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> wrote:
>
> > We don't follow this behaviour exactly today.
> >
> > Adding this behaviour back vs the behaviour we have now doesn't seem
> > to improve anything at visible levels. We don't have a counter but we
> > can look at the RCU nesting counter, which should be zero once locks
> > have been dropped. So this can be used for testing.
> >
> > But as I said: using "run to completion" and preempting on the return
> > to userland, rather than once the lazy flag is seen and all locks
> > have been released, appears to be better.
> >
> > It is (now) possible that you run for a long time and get preempted
> > while holding a spinlock_t. It is however more likely that you
> > release all locks and get preempted while returning to userland.
>
> IIUC, today, LAZY causes all SCHED_OTHER tasks to act more like
> PREEMPT_NONE. Is that correct?

Well. The first sched-tick sets the LAZY bit, the second sched-tick
forces a resched (a minimal sketch of this two-tick escalation is at the
end of this mail). On PREEMPT_NONE the sched-tick would set NEED_RESCHED
while nothing would force a resched until the task decides to invoke
schedule() on its own. So it is slightly different for kernel threads.
For userland tasks it does not matter whether it is LAZY or NONE: either
way there is a resched on the return to userland after the sched-tick.

> Now that the PREEMPT_RT is not one of the preemption selections, when you
> select PREEMPT_RT, you can pick between CONFIG_PREEMPT and
> CONFIG_PREEMPT_LAZY. Where CONFIG_PREEMPT will preempt the kernel at the
> scheduler tick if preemption is enabled and CONFIG_PREEMPT_LAZY will
> not preempt the kernel on a scheduler tick and wait for exit to user space.

This is not specific to RT but to FULL vs LAZY. But yes. However, the
second sched-tick will force a preemption point even without the exit
to userland.

> Sebastian,
>
> It appears you only tested the CONFIG_PREEMPT_LAZY selection. Have you
> tested the difference of how CONFIG_PREEMPT behaves between PREEMPT_RT and
> no PREEMPT_RT? I think that will show a difference like we had in the past.

Not that I remember testing PREEMPT vs PREEMPT_RT. I remember people
complained about high networking load on RT, which became visible due
to threaded interrupts (as in top), while on non-RT it was more or less
hidden and not clearly visible due to how the time is accounted. The
network performance itself was mostly the same as far as I remember
(that is, gbit).

> I can see people picking both PREEMPT_RT and CONFIG_PREEMPT (Full), but
> then wondering why their non RT tasks are suffering from a performance
> penalty from that. We might want to opt in for lazy by default on RT.

That was the case in the RT queue until it was replaced with
PREEMPT_AUTO. But then why not use LAZY in favour of PREEMPT? Mike had
numbers at
https://lore.kernel.org/all/9df22ebbc2e6d426099bf380477a0ed885068896.camel@xxxxxx/
where LAZY had mostly the performance of VOLUNTARY with fewer context
switches than PREEMPT. That also means there is no need for
cond_resched() and friends.
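
For reference, a minimal sketch of the two-tick escalation described
above. It is illustrative only: the slice_exceeded parameter stands in
for the scheduler's "time slice used up" decision, and the real code in
kernel/sched/ is structured differently.

	#include <linux/sched.h>

	/*
	 * Hypothetical per-tick check for the current task under
	 * CONFIG_PREEMPT_LAZY.
	 */
	static void lazy_tick_sketch(struct task_struct *curr, bool slice_exceeded)
	{
		if (test_tsk_thread_flag(curr, TIF_NEED_RESCHED_LAZY)) {
			/*
			 * Second tick: the lazy bit set on the previous
			 * tick was ignored (no return to userland, no
			 * schedule() call in between), so force a real
			 * preemption point now.
			 */
			set_tsk_need_resched(curr);
			return;
		}

		if (slice_exceeded) {
			/*
			 * First tick past the slice: only set the lazy
			 * bit. The task keeps running until it returns
			 * to userland, where the bit is honoured like
			 * NEED_RESCHED.
			 */
			set_tsk_thread_flag(curr, TIF_NEED_RESCHED_LAZY);
		}
	}

> -- Steve

Sebastian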