From: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx> I'm debugging some latency issues on a Chromebook and the preemptirqsoff tracer hit this: # tracer: preemptirqsoff # # preemptirqsoff latency trace v1.1.5 on 5.15.148-21853-g165fd2387469-dirty # -------------------------------------------------------------------- # latency: 7813 us, #60/60, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) # ----------------- # | task: init-1 (uid:0 nice:0 policy:0 rt_prio:0) # ----------------- # => started at: rwsem_optimistic_spin # => ended at: rwsem_optimistic_spin # # # _------=> CPU# # / _-----=> irqs-off # | / _----=> need-resched # || / _---=> hardirq/softirq # ||| / _--=> preempt-depth # |||| / _-=> migrate-disable # ||||| / delay # cmd pid |||||| time | caller # \ / |||||| \ | / <...>-1 1...1. 0us!: rwsem_optimistic_spin+0x20/0x194 <-rwsem_optimistic_spin+0x20/0x194 <...>-1 1.N.1. 7813us : rwsem_optimistic_spin+0x140/0x194 <-rwsem_optimistic_spin+0x140/0x194 <...>-1 1.N.1. 7814us+: tracer_preempt_on+0x4c/0x6a <-rwsem_optimistic_spin+0x140/0x194 <...>-1 1.N.1. 7824us : <stack trace> => rwsem_optimistic_spin+0x140/0x194 => rwsem_down_write_slowpath+0xc9/0x3fe => copy_process+0xd08/0x1b8a => kernel_clone+0x94/0x256 => __x64_sys_clone+0x7a/0x9a => do_syscall_64+0x51/0xa1 => entry_SYSCALL_64_after_hwframe+0x5c/0xc6 Which isn't a real issue, as it's in the rwsem_optimistic_spin() code that spins with interrupts enabled, preempt disabled, and checks for need_resched(). If it is true, it breaks out and schedules. Hence, it's hiding real issues, because it can spin for a very long time and this is not the source of the latency I'm tracking down. I would like to introduce restart_critical_timings() and place it in locations that have this behavior. Signed-off-by: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx> diff --git a/include/linux/irqflags.h b/include/linux/irqflags.h index 147feebd508c..e9f97f60bfc0 100644 --- a/include/linux/irqflags.h +++ b/include/linux/irqflags.h @@ -145,6 +145,13 @@ do { \ # define start_critical_timings() do { } while (0) #endif +/* Used in spins that check need_resched() with preemption disabled */ +static inline void restart_critical_timings(void) +{ + stop_critical_timings(); + start_critical_timings(); +} + #ifdef CONFIG_DEBUG_IRQFLAGS extern void warn_bogus_irq_restore(void); #define raw_check_bogus_irq_restore() \ diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c index 2340b6d90ec6..fa7b43e53d20 100644 --- a/kernel/locking/rwsem.c +++ b/kernel/locking/rwsem.c @@ -27,6 +27,7 @@ #include <linux/export.h> #include <linux/rwsem.h> #include <linux/atomic.h> +#include <linux/irqflags.h> #include <trace/events/lock.h> #ifndef CONFIG_PREEMPT_RT @@ -780,6 +781,7 @@ rwsem_spin_on_owner(struct rw_semaphore *sem) */ barrier(); + restart_critical_timings(); if (need_resched() || !owner_on_cpu(owner)) { state = OWNER_NONSPINNABLE; break; @@ -912,6 +914,7 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem) * a writer, need_resched() check needs to be done here. */ if (owner_state != OWNER_WRITER) { + restart_critical_timings(); if (need_resched()) break; if (rt_task(current) &&