Re: [PATCH v3] rcu/tree: Add a trace event for RCU stall warnings

On Tue, Mar 02, 2021 at 10:34:26PM +0530, Neeraj Upadhyay wrote:
> On 3/2/2021 5:25 PM, Sangmoon Kim wrote:
> > The event allows us to trace the RCU stall when
> > sysctl_panic_on_rcu_stall is disabled.
> > 
> > The first parameter is the name of RCU flavour like other trace
> > events. The second one shows us which function detected stalls.
> > 
> > The RCU stall is mainly caused by external factors such as interrupt
> > handling or task scheduling or something else. Therefore, this event
> > uses TRACE_EVENT macro, not dedicated one, so that someone interested
> > in the RCU stall can use it without CONFIG_RCU_TRACE.
> > 
> > Signed-off-by: Sangmoon Kim <sangmoon.kim@xxxxxxxxxxx>
> > Reviewed-by: Uladzislau Rezki (Sony) <urezki@xxxxxxxxx>

[ . . . ]

> Reviewed-by: Neeraj Upadhyay <neeraju@xxxxxxxxxxxxxx>

Thank you all!  As usual, I wordsmithed the commit log as shown below.
Please let me know if I messed anything up.

							Thanx, Paul

------------------------------------------------------------------------

commit 4ee0eb7c0cbccaae8e5e3681d852d4e7f50c4378
Author: Sangmoon Kim <sangmoon.kim@xxxxxxxxxxx>
Date:   Tue Mar 2 20:55:15 2021 +0900

    rcu/tree: Add a trace event for RCU CPU stall warnings
    
    This commit adds a trace event which allows tracing the beginnings of RCU
    CPU stall warnings on systems where sysctl_panic_on_rcu_stall is disabled.
    
    The first parameter is the name of the RCU flavor, as in the other RCU
    trace events.  The second parameter indicates whether this is a stall
    of an expedited grace period, a self-detected stall of a normal grace
    period, or a stall of a normal grace period detected by some CPU other
    than the one that is stalled.
    
    RCU CPU stall warnings are often caused by external-to-RCU issues,
    for example, in interrupt handling or task scheduling.  Therefore,
    this event uses TRACE_EVENT, not TRACE_EVENT_RCU, to avoid requiring
    those interested in tracing RCU CPU stalls to rebuild their kernels
    with CONFIG_RCU_TRACE=y.
    
    Reviewed-by: Uladzislau Rezki (Sony) <urezki@xxxxxxxxx>
    Reviewed-by: Neeraj Upadhyay <neeraju@xxxxxxxxxxxxxx>
    Signed-off-by: Sangmoon Kim <sangmoon.kim@xxxxxxxxxxx>
    Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>

diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index 5fc2940..c7711e9 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -432,6 +432,34 @@ TRACE_EVENT_RCU(rcu_fqs,
 		  __entry->cpu, __entry->qsevent)
 );
 
+/*
+ * Tracepoint for RCU stall events. Takes a string identifying the RCU flavor
+ * and a string identifying which function detected the RCU stall as follows:
+ *
+ *	"StallDetected": Scheduler-tick detects other CPU's stalls.
+ *	"SelfDetected": Scheduler-tick detects a current CPU's stall.
+ *	"ExpeditedStall": Expedited grace period detects stalls.
+ */
+TRACE_EVENT(rcu_stall_warning,
+
+	TP_PROTO(const char *rcuname, const char *msg),
+
+	TP_ARGS(rcuname, msg),
+
+	TP_STRUCT__entry(
+		__field(const char *, rcuname)
+		__field(const char *, msg)
+	),
+
+	TP_fast_assign(
+		__entry->rcuname = rcuname;
+		__entry->msg = msg;
+	),
+
+	TP_printk("%s %s",
+		  __entry->rcuname, __entry->msg)
+);
+
 #endif /* #if defined(CONFIG_TREE_RCU) */
 
 /*
diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 6c6ff06..2796084 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -521,6 +521,7 @@ static void synchronize_rcu_expedited_wait(void)
 		if (rcu_stall_is_suppressed())
 			continue;
 		panic_on_rcu_stall();
+		trace_rcu_stall_warning(rcu_state.name, TPS("ExpeditedStall"));
 		pr_err("INFO: %s detected expedited stalls on CPUs/tasks: {",
 		       rcu_state.name);
 		ndetected = 0;
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 475b261..59b95cc 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -536,6 +536,7 @@ static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
 	 * See Documentation/RCU/stallwarn.rst for info on how to debug
 	 * RCU CPU stall warnings.
 	 */
+	trace_rcu_stall_warning(rcu_state.name, TPS("StallDetected"));
 	pr_err("INFO: %s detected stalls on CPUs/tasks:\n", rcu_state.name);
 	rcu_for_each_leaf_node(rnp) {
 		raw_spin_lock_irqsave_rcu_node(rnp, flags);
@@ -606,6 +607,7 @@ static void print_cpu_stall(unsigned long gps)
 	 * See Documentation/RCU/stallwarn.rst for info on how to debug
 	 * RCU CPU stall warnings.
 	 */
+	trace_rcu_stall_warning(rcu_state.name, TPS("SelfDetected"));
 	pr_err("INFO: %s self-detected stall on CPU\n", rcu_state.name);
 	raw_spin_lock_irqsave_rcu_node(rdp->mynode, flags);
 	print_cpu_stall_info(smp_processor_id());
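
For anyone consuming the new event from user space, here is a minimal sketch
of how the "%s %s" payload might be decoded.  It is not part of the patch.
The helper names (classify_stall, parse_stall_line) and the enum are
hypothetical, and it assumes each trace line carries the event name followed
by ": " ahead of the payload, which may differ with your trace options.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical user-space view of the three msg strings added by the patch. */
enum stall_kind {
	STALL_UNKNOWN,
	STALL_OTHER_CPU,	/* "StallDetected" */
	STALL_SELF,		/* "SelfDetected" */
	STALL_EXPEDITED,	/* "ExpeditedStall" */
};

/* Map the event's msg string onto the cases listed in the patch comment. */
static enum stall_kind classify_stall(const char *msg)
{
	if (strcmp(msg, "StallDetected") == 0)
		return STALL_OTHER_CPU;
	if (strcmp(msg, "SelfDetected") == 0)
		return STALL_SELF;
	if (strcmp(msg, "ExpeditedStall") == 0)
		return STALL_EXPEDITED;
	return STALL_UNKNOWN;
}

/*
 * Extract "<rcuname> <msg>" from a trace line.  Assumes the payload
 * follows the literal text "rcu_stall_warning: ".  Returns 0 on success,
 * or -1 if the line does not contain a well-formed event.
 */
static int parse_stall_line(const char *line, char *rcuname, size_t len,
			    enum stall_kind *kind)
{
	static const char tag[] = "rcu_stall_warning: ";
	const char *p = strstr(line, tag);
	char name[32];
	char msg[32];

	if (!p)
		return -1;
	p += sizeof(tag) - 1;
	if (sscanf(p, "%31s %31s", name, msg) != 2)
		return -1;
	snprintf(rcuname, len, "%s", name);
	*kind = classify_stall(msg);
	return 0;
}
```

A caller would feed this one line at a time from the trace buffer and act
on the returned stall_kind, for example by snapshotting other trace data
only when a self-detected stall is seen.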
