Re: WARNING: events/ipi.h:36 suspicious rcu_dereference_check() usage!

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Leo,

2020年4月17日(金) 16:15 Leo Yan <leo.yan@xxxxxxxxxx>:
>
> On Fri, Apr 17, 2020 at 03:53:27PM +0900, Masami Hiramatsu wrote:
> > Hello,
> >
> > It seems that *arm/arm64 specific* IPI trace event has caused this issue.
> > From the stacktrace, __update_max_tr() is invoked from do_idle() context.
> >

Good point! Yes, we can use trace_ipi_raise_rcuidle() to fix that.

Thank you,

> > ------
> > __update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu)
> > {
> > [SNIP]
> >         /* record this tasks comm */
> >         tracing_record_cmdline(tsk);
> >         latency_fsnotify(tr);  <<------here
> > }
> > ------
> > And via latency_fsnotify(), it calls irq_work_queue().
> >
> > ------
> > void latency_fsnotify(struct trace_array *tr)
> > {
> >         if (!fsnotify_wq)
> >                 return;
> >         /*
> >          * We cannot call queue_work(&tr->fsnotify_work) from here because it's
> >          * possible that we are called from __schedule() or do_idle(), which
> >          * could cause a deadlock.
> >          */
> >         irq_work_queue(&tr->fsnotify_irqwork); <<------- here
> > }
> > ------
> > Please NOTE(*) that the above comment said that irq_work_queue()
> > should be SAFE from calling in do_idle(), this means it doesn't touch
> > any RCU.
> >
> > And the irq_work_queue() finally kicks IPI via arch_irq_work_raise()
> > which causes an IPI.
> >
> > --------<arm64: arch/arm64/kernel/smp.c>--------
> > #ifdef CONFIG_IRQ_WORK
> > void arch_irq_work_raise(void)
> > {
> >         if (__smp_cross_call)
> >                 smp_cross_call(cpumask_of(smp_processor_id()), IPI_IRQ_WORK);
> > }
> > #endif
> >
> > static void smp_cross_call(const struct cpumask *target, unsigned int ipinr)
> > {
> >         trace_ipi_raise(target, ipi_types[ipinr]);  <<----- This
> > causes a warning!
> >         __smp_cross_call(target, ipinr);
> > }
> > -------
> > And trace_* macro touch the RCU. See include/linux/tracepoint.h:231
> > ------
> > #define __DECLARE_TRACE(name, proto, args, cond, data_proto, data_args) \
> >         extern struct tracepoint __tracepoint_##name;                   \
> >         static inline void trace_##name(proto)                          \
> >         {                                                               \
> >                 if (static_key_false(&__tracepoint_##name.key))         \
> >                         __DO_TRACE(&__tracepoint_##name,                \
> >                                 TP_PROTO(data_proto),                   \
> >                                 TP_ARGS(data_args),                     \
> >                                 TP_CONDITION(cond), 0);                 \
> >                 if (IS_ENABLED(CONFIG_LOCKDEP) && (cond)) {             \
> >                         rcu_read_lock_sched_notrace();                  \
> >                         rcu_dereference_sched(__tracepoint_##name.funcs);\
> >                         rcu_read_unlock_sched_notrace();                \
> >                 }                                                       \
> >         }
> > ------
> >
> > BTW, this ipi_raise trace event is only used in arm/arm64.
> >
> > $ git grep trace_ipi_raise
> > arch/arm/kernel/smp.c:  trace_ipi_raise_rcuidle(target, ipi_types[ipinr]);
> > arch/arm64/kernel/smp.c:        trace_ipi_raise(target, ipi_types[ipinr]);
> > include/trace/events/ipi.h: * __tracepoint_string, ideally the same as
> > used with trace_ipi_raise
> > include/trace/events/ipi.h: * __tracepoint_string, ideally the same as
> > used with trace_ipi_raise for
> >
> > Thus, this only occurs on arm64 platform, but not on x86.
> >
> > To fix this, maybe we need to remove this event or mark this is under
> > rcu watched (but is it really watched?)
>
> Good analysis, Masami :)
>
> Seems to me, Arm64 should study Arm platform to change using
> trace_ipi_raise_rcuidle() to work around the restriction so can allow the
> event to work in idle context?
>
> Thanks,
> Leo
>
> > 2020年4月8日(水) 18:31 Naresh Kamboju <naresh.kamboju@xxxxxxxxxx>:
> >
> > >
> > > On arm64 qemu_arm64, juno-r2 and dragonboard-410c while running kselftest ftrace
> > > on stable rc 5.5.1-rc1 to till today 5.5.16-rc2 and 5.6  found this
> > > kernel warning.
> > >
> > > [  386.349099] kselftest: Running tests in ftrace
> > > [  393.984018]
> > > [  393.984290] =============================
> > > [  393.984781] WARNING: suspicious RCU usage
> > > [  393.988690] 5.6.3-rc2 #1 Not tainted
> > > [  393.992679] -----------------------------
> > > [  393.996327] /usr/src/kernel/include/trace/events/ipi.h:36
> > > suspicious rcu_dereference_check() usage!
> > > [  394.000241]
> > > [  394.000241] other info that might help us debug this:
> > > [  394.000241]
> > > [  394.009094]
> > > [  394.009094] RCU used illegally from idle CPU!
> > > [  394.009094] rcu_scheduler_active = 2, debug_locks = 1
> > > [  394.017084] RCU used illegally from extended quiescent state!
> > > [  394.028187] 1 lock held by swapper/3/0:
> > > [  394.033826]  #0: ffff80001237b6a8 (max_trace_lock){....}, at:
> > > check_critical_timing+0x7c/0x1a8
> > > [  394.037480]
> > > [  394.037480] stack backtrace:
> > > [  394.046158] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.6.3-rc2 #1
> > > [  394.050584] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> > > [  394.056663] Call trace:
> > > [  394.063515]  dump_backtrace+0x0/0x1e0
> > > [  394.065686]  show_stack+0x24/0x30
> > > [  394.069505]  dump_stack+0xe8/0x150
> > > [  394.072805]  lockdep_rcu_suspicious+0xcc/0x110
> > > [  394.076106]  arch_irq_work_raise+0x208/0x210
> > > [  394.080533]  __irq_work_queue_local+0x5c/0x80
> > > [  394.084959]  irq_work_queue+0x38/0x78
> > > [  394.089212]  __update_max_tr+0x150/0x218
> > > [  394.092858]  update_max_tr_single.part.82+0x98/0x100
> > > [  394.096851]  update_max_tr_single+0x1c/0x28
> > > [  394.101798]  check_critical_timing+0x198/0x1a8
> > > [  394.105705]  stop_critical_timings+0x128/0x148
> > > [  394.110221]  cpuidle_enter_state+0x74/0x4f8
> > > [  394.114645]  cpuidle_enter+0x3c/0x50
> > > [  394.118726]  call_cpuidle+0x44/0x80
> > > [  394.122542]  do_idle+0x22c/0x2d0
> > > [  394.125755]  cpu_startup_entry+0x28/0x48
> > > [  394.129229]  secondary_start_kernel+0x1b4/0x210
> > >
> > >
> > > metadata:
> > >   git branch: linux-5.5.y and linux-5.6.y
> > >   git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
> > >   kernel-config:
> > > http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/juno/lkft/linux-stable-rc-5.5/65/config
> > >
> > > Full test log,
> > > 5.6 test logs,
> > > https://lkft.validation.linaro.org/scheduler/job/1350627#L12612
> > > https://lkft.validation.linaro.org/scheduler/job/1350731#L9509
> > >
> > > 5.5 test logs,
> > > https://lkft.validation.linaro.org/scheduler/job/1322704#L9777
> > > https://lkft.validation.linaro.org/scheduler/job/1153369#L9745
> > > https://lkft.validation.linaro.org/scheduler/job/1351155#L8982
> > > https://lkft.validation.linaro.org/scheduler/job/1351065#L12349
> > >
> > > --
> > > Linaro LKFT
> > > https://lkft.linaro.org
> >
> >
> >
> > --
> > Masami Hiramatsu



-- 
Masami Hiramatsu




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux