This is a note to let you know that I've just added the patch titled rcu: Avoid tracing a few functions executed in stop machine to the 5.15-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: rcu-avoid-tracing-a-few-functions-executed-in-stop-machine.patch and it can be found in the queue-5.15 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. >From 48f8070f5dd8e13148ae4647780a452d53c457a2 Mon Sep 17 00:00:00 2001 From: Patrick Wang <patrick.wang.shcn@xxxxxxxxx> Date: Tue, 26 Apr 2022 18:45:02 +0800 Subject: rcu: Avoid tracing a few functions executed in stop machine From: Patrick Wang <patrick.wang.shcn@xxxxxxxxx> commit 48f8070f5dd8e13148ae4647780a452d53c457a2 upstream. Stop-machine recently started calling additional functions while waiting: ---------------------------------------------------------------- Former stop machine wait loop: do { cpu_relax(); => macro ... } while (curstate != STOPMACHINE_EXIT); ----------------------------------------------------------------- Current stop machine wait loop: do { stop_machine_yield(cpumask); => function (notraced) ... touch_nmi_watchdog(); => function (notraced, inside calls also notraced) ... rcu_momentary_dyntick_idle(); => function (notraced, inside calls traced) } while (curstate != MULTI_STOP_EXIT); ------------------------------------------------------------------ These functions (and the functions that they call) must be marked notrace to prevent them from being updated while they are executing. The consequences of failing to mark these functions can be severe: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: rcu: 1-...!: (0 ticks this GP) idle=14f/1/0x4000000000000000 softirq=3397/3397 fqs=0 rcu: 3-...!: (0 ticks this GP) idle=ee9/1/0x4000000000000000 softirq=5168/5168 fqs=0 (detected by 0, t=8137 jiffies, g=5889, q=2 ncpus=4) Task dump for CPU 1: task:migration/1 state:R running task stack: 0 pid: 19 ppid: 2 flags:0x00000000 Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174 Call Trace: Task dump for CPU 3: task:migration/3 state:R running task stack: 0 pid: 29 ppid: 2 flags:0x00000000 Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174 Call Trace: rcu: rcu_preempt kthread timer wakeup didn't happen for 8136 jiffies! g5889 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 rcu: Possible timer handling issue on cpu=2 timer-softirq=594 rcu: rcu_preempt kthread starved for 8137 jiffies! g5889 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=2 rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. rcu: RCU grace-period kthread stack dump: task:rcu_preempt state:I stack: 0 pid: 14 ppid: 2 flags:0x00000000 Call Trace: schedule+0x56/0xc2 schedule_timeout+0x82/0x184 rcu_gp_fqs_loop+0x19a/0x318 rcu_gp_kthread+0x11a/0x140 kthread+0xee/0x118 ret_from_exception+0x0/0x14 rcu: Stack dump where RCU GP kthread last ran: Task dump for CPU 2: task:migration/2 state:R running task stack: 0 pid: 24 ppid: 2 flags:0x00000000 Stopper: multi_cpu_stop+0x0/0x18c <- stop_machine_cpuslocked+0x128/0x174 Call Trace: This commit therefore marks these functions notrace: rcu_preempt_deferred_qs() rcu_preempt_need_deferred_qs() rcu_preempt_deferred_qs_irqrestore() [ paulmck: Apply feedback from Neeraj Upadhyay. ] Signed-off-by: Patrick Wang <patrick.wang.shcn@xxxxxxxxx> Acked-by: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx> Reviewed-by: Neeraj Upadhyay <quic_neeraju@xxxxxxxxxxx> Signed-off-by: Ronald Monthero <debug.penguin32@xxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- kernel/rcu/tree_plugin.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -458,7 +458,7 @@ static bool rcu_preempt_has_tasks(struct * be quite short, for example, in the case of the call from * rcu_read_unlock_special(). */ -static void +static notrace void rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags) { bool empty_exp; @@ -578,7 +578,7 @@ rcu_preempt_deferred_qs_irqrestore(struc * is disabled. This function cannot be expected to understand these * nuances, so the caller must handle them. */ -static bool rcu_preempt_need_deferred_qs(struct task_struct *t) +static notrace bool rcu_preempt_need_deferred_qs(struct task_struct *t) { return (__this_cpu_read(rcu_data.exp_deferred_qs) || READ_ONCE(t->rcu_read_unlock_special.s)) && @@ -592,7 +592,7 @@ static bool rcu_preempt_need_deferred_qs * evaluate safety in terms of interrupt, softirq, and preemption * disabling. */ -static void rcu_preempt_deferred_qs(struct task_struct *t) +static notrace void rcu_preempt_deferred_qs(struct task_struct *t) { unsigned long flags; @@ -923,7 +923,7 @@ static bool rcu_preempt_has_tasks(struct * Because there is no preemptible RCU, there can be no deferred quiescent * states. */ -static bool rcu_preempt_need_deferred_qs(struct task_struct *t) +static notrace bool rcu_preempt_need_deferred_qs(struct task_struct *t) { return false; } Patches currently in stable-queue which might be from patrick.wang.shcn@xxxxxxxxx are queue-5.15/rcu-avoid-tracing-a-few-functions-executed-in-stop-machine.patch