On Sun, 23 Apr 2017 04:30:01 -0700 tip-bot for Thomas Gleixner <tipbot@xxxxxxxxx> wrote: > Commit-ID: 24db7a671bd5eea76b17138b976eb9a4072f1b7a > Gitweb: http://git.kernel.org/tip/24db7a671bd5eea76b17138b976eb9a4072f1b7a > Author: Thomas Gleixner <tglx@xxxxxxxxxxxxx> > AuthorDate: Sun, 23 Apr 2017 12:17:13 +0200 > Committer: Thomas Gleixner <tglx@xxxxxxxxxxxxx> > CommitDate: Sun, 23 Apr 2017 13:24:34 +0200 > > trace/perf: Cure hotplug lock ordering issues > > The get_online_cpus() rework unearthed lock ordering issues. > > Reorder hotpluglock and event_mutex() to avoid circular locking > dependencies and convert static_key_slow_dec() to the cpuslocked version to > avoid hotplug lock recursion. > > That also requires to protect the call to perf_trace_destroy() against cpu > hotplug. > > Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx> > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx> > Cc: Steven Rostedt <rostedt@xxxxxxxxxxx> BTW, I only got an email with the tip commit. Was this sent out before committing to tip? > > --- > kernel/events/core.c | 2 ++ > kernel/trace/trace_events.c | 6 ++++++ > kernel/tracepoint.c | 4 ++-- > 3 files changed, 10 insertions(+), 2 deletions(-) > > diff --git a/kernel/events/core.c b/kernel/events/core.c > index b5b4f52f..997123c 100644 > --- a/kernel/events/core.c > +++ b/kernel/events/core.c > @@ -7783,7 +7783,9 @@ EXPORT_SYMBOL_GPL(perf_tp_event); > > static void tp_perf_event_destroy(struct perf_event *event) > { > + get_online_cpus(); > perf_trace_destroy(event); > + put_online_cpus(); > } OK, so I merged the branch I had Linus pull into the commit just before this one in tip. Then I ran all my tracing tests and not one triggers any lockdep issues. Just to make sure this was still an issue, as soon as I ran: # perf record -e sched:sched_switch -a sleep 1 It triggered: [ 198.228443] ====================================================== [ 198.235892] [ INFO: possible circular locking dependency detected ] [ 198.243406] 4.11.0-test+ #541 Not tainted [ 198.248657] ------------------------------------------------------- [ 198.256144] perf/4027 is trying to acquire lock: [ 198.261972] (event_mutex){+.+.+.}, at: [<ffffffff81195713>] perf_trace_init+0x23/0x2d0 [ 198.271162] [ 198.271162] but task is already holding lock: [ 198.279353] (cpu_hotplug_lock.rw_sem){++++++}, at: [<ffffffff811b8280>] SYSC_perf_event_open+0x3c0/0x10f0 The point being, there's no place that holds event_mutex that actually requires having get_online_cpus(), besides perf. And I don't think perf needs it here either. It probably needs it for the hardware events, but does it really need it for the software ones? Is there a way to separate software event initialization from hardware ones? Or does it really need to have get_online_cpus() when enabling software events. Can we have it as a delayed case? Enable them after put_online_cpus() and disable them before the call to get_online_cpus()? I'm thinking it will be a hell of a lot easier to make perf not need to have get_online_cpus() taken when calling the tp events. Have it separated out if possible. Otherwise, you are going to be playing whack a mole for a long time if you want to inverse event_mutex with get_online_cpus(). -- Steve > > static int perf_tp_event_init(struct perf_event *event) > diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c > index 9311654..8156bfd 100644 > --- a/kernel/trace/trace_events.c > +++ b/kernel/trace/trace_events.c > @@ -747,9 +747,11 @@ static int __ftrace_set_clr_event(struct trace_array *tr, const char *match, > { > int ret; > > + get_online_cpus(); > mutex_lock(&event_mutex); > ret = __ftrace_set_clr_event_nolock(tr, match, sub, event, set); > mutex_unlock(&event_mutex); > + put_online_cpus(); > > return ret; > } > @@ -1039,11 +1041,13 @@ event_enable_write(struct file *filp, const char __user *ubuf, size_t cnt, > case 0: > case 1: > ret = -ENODEV; > + get_online_cpus(); > mutex_lock(&event_mutex); > file = event_file_data(filp); > if (likely(file)) > ret = ftrace_event_enable_disable(file, val); > mutex_unlock(&event_mutex); > + put_online_cpus(); > break; > > default: > @@ -2902,6 +2906,7 @@ early_event_add_tracer(struct dentry *parent, struct trace_array *tr) > > int event_trace_del_tracer(struct trace_array *tr) > { > + get_online_cpus(); > mutex_lock(&event_mutex); > > /* Disable any event triggers and associated soft-disabled events */ > @@ -2924,6 +2929,7 @@ int event_trace_del_tracer(struct trace_array *tr) > tr->event_dir = NULL; > > mutex_unlock(&event_mutex); > + put_online_cpus(); > > return 0; > } > diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c > index 685c50a..10ce910 100644 > --- a/kernel/tracepoint.c > +++ b/kernel/tracepoint.c > @@ -220,7 +220,7 @@ static int tracepoint_add_func(struct tracepoint *tp, > */ > rcu_assign_pointer(tp->funcs, tp_funcs); > if (!static_key_enabled(&tp->key)) > - static_key_slow_inc(&tp->key); > + static_key_slow_inc_cpuslocked(&tp->key); > release_probes(old); > return 0; > } > @@ -250,7 +250,7 @@ static int tracepoint_remove_func(struct tracepoint *tp, > tp->unregfunc(); > > if (static_key_enabled(&tp->key)) > - static_key_slow_dec(&tp->key); > + static_key_slow_dec_cpuslocked(&tp->key); > } > rcu_assign_pointer(tp->funcs, tp_funcs); > release_probes(old); -- To unsubscribe from this list: send the line "unsubscribe linux-tip-commits" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html
![]() |