On Fri, 5 Aug 2016 16:35:55 +0200
Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> wrote:

> * Steven Rostedt | 2016-08-04 13:16:45 [-0400]:
>
> >diff --git a/include/linux/ftrace_irq.h b/include/linux/ftrace_irq.h
> >index dca7bf8cffe2..4ec2c9b205f2 100644
> >--- a/include/linux/ftrace_irq.h
> >+++ b/include/linux/ftrace_irq.h
> >@@ -3,11 +3,34 @@
> …
> >+static inline void ftrace_nmi_enter(void)
> >+{
> >+#ifdef CONFIG_HWLAT_TRACER
> >+	if (trace_hwlat_callback_enabled)
> >+		trace_hwlat_callback(true);
>
> so we take a tracepoint while we enter an nmi

It's not technically a tracepoint. I'm not sure tracepoints
(jumplabels) may be located this early in the NMI handler. This is
before some of the magic of having NMIs dealing with page faults and
break points.

>
> >--- a/kernel/trace/trace_hwlat.c
> >+++ b/kernel/trace/trace_hwlat.c
> >@@ -64,6 +64,15 @@ static struct dentry *hwlat_sample_window; /* sample window us */
> > /* Save the previous tracing_thresh value */
> > static unsigned long save_tracing_thresh;
> >
> >+/* NMI timestamp counters */
> >+static u64 nmi_ts_start;
> >+static u64 nmi_total_ts;
> >+static int nmi_count;
> >+static int nmi_cpu;
>
> and this is always limited to one CPU at a time?

Yes. Hence the "nmi_cpu".

>
> …
> >@@ -125,6 +138,19 @@ static void trace_hwlat_sample(struct hwlat_sample *sample)
> > #define init_time(a, b)	(a = b)
> > #define time_u64(a)		a
> >
> >+void trace_hwlat_callback(bool enter)
> >+{
> >+	if (smp_processor_id() != nmi_cpu)
> >+		return;
> >+
> >+	if (enter)
> >+		nmi_ts_start = time_get();
>
> but more interestingly: trace_clock_local() -> sched_clock()
> and of kernel/time/sched_clock.c we do raw_read_seqcount(&cd.seq) which
> means we are busted if the NMI triggers during update_clock_read_data().

Hmm, interesting. Because this is true for general tracing from an NMI.

/me looks at code.

Ah, this is when we have GENERIC_SCHED_CLOCK, which would break tracing
if any arch that has this also has NMIs.
Probably need to look at arm64. For x86, it has its own NMI safe
sched_clock. I could make this "NMI" code depend on:

  #ifndef CONFIG_GENERIC_SCHED_CLOCK

-- Steve

>
> >+	else {
> >+		nmi_total_ts = time_get() - nmi_ts_start;
> >+		nmi_count++;
> >+	}
> >+}
>
> Sebastian
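
For illustration only, here is a minimal userspace sketch of what the
"depend on #ifndef CONFIG_GENERIC_SCHED_CLOCK" idea might look like. It
is not the kernel patch: sim_processor_id(), sim_time_get(), fake_cpu
and fake_clock are hypothetical stand-ins for smp_processor_id(),
time_get() and the hardware, and keeping the nmi_count increment outside
the guard is an assumption about how the dependency could be applied.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-ins (assumptions, not kernel API):
 * fake_cpu plays the CPU the "NMI" fires on, fake_clock plays the
 * value a clock read would return.
 */
static int fake_cpu;
static uint64_t fake_clock;

static int sim_processor_id(void) { return fake_cpu; }
static uint64_t sim_time_get(void) { return fake_clock; }

/* NMI timestamp counters, named as in the quoted patch */
static uint64_t nmi_ts_start;
static uint64_t nmi_total_ts;
static int nmi_count;
static int nmi_cpu;

void trace_hwlat_callback(bool enter)
{
	/* Only the CPU being sampled is accounted. */
	if (sim_processor_id() != nmi_cpu)
		return;

#ifndef CONFIG_GENERIC_SCHED_CLOCK
	/* Clock reads are compiled in only when assumed safe from an NMI. */
	if (enter)
		nmi_ts_start = sim_time_get();
	else
		nmi_total_ts = sim_time_get() - nmi_ts_start;
#endif

	/* Counting needs no clock read, so it stays unconditional here. */
	if (!enter)
		nmi_count++;
}
```

Building the sketch with -DCONFIG_GENERIC_SCHED_CLOCK drops the two
clock reads and leaves only the NMI count, which is the behaviour the
dependency is meant to give on architectures using the generic
sched_clock.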