Re: WARNING: events/ipi.h:36 suspicious rcu_dereference_check() usage!

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Apr 17, 2020 at 03:53:27PM +0900, Masami Hiramatsu wrote:
> Hello,
> 
> It seems that *arm/arm64 specific* IPI trace event has caused this issue.
> From the stacktrace, __update_max_tr() is invoked from do_idle() context.
> 
> ------
> __update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu)
> {
> [SNIP]
>         /* record this tasks comm */
>         tracing_record_cmdline(tsk);
>         latency_fsnotify(tr);  <<------here
> }
> ------
> And via latency_fsnotify(), it calls irq_work_queue().
> 
> ------
> void latency_fsnotify(struct trace_array *tr)
> {
>         if (!fsnotify_wq)
>                 return;
>         /*
>          * We cannot call queue_work(&tr->fsnotify_work) from here because it's
>          * possible that we are called from __schedule() or do_idle(), which
>          * could cause a deadlock.
>          */
>         irq_work_queue(&tr->fsnotify_irqwork); <<------- here
> }
> ------
> Please NOTE(*) that the above comment said that irq_work_queue()
> should be SAFE from calling in do_idle(), this means it doesn't touch
> any RCU.
> 
> And the irq_work_queue() finally kicks IPI via arch_irq_work_raise()
> which causes an IPI.
> 
> --------<arm64: arch/arm64/kernel/smp.c>--------
> #ifdef CONFIG_IRQ_WORK
> void arch_irq_work_raise(void)
> {
>         if (__smp_cross_call)
>                 smp_cross_call(cpumask_of(smp_processor_id()), IPI_IRQ_WORK);
> }
> #endif
> 
> static void smp_cross_call(const struct cpumask *target, unsigned int ipinr)
> {
>         trace_ipi_raise(target, ipi_types[ipinr]);  <<----- This
> causes a warning!
>         __smp_cross_call(target, ipinr);
> }
> -------
> And trace_* macro touch the RCU. See include/linux/tracepoint.h:231
> ------
> #define __DECLARE_TRACE(name, proto, args, cond, data_proto, data_args) \
>         extern struct tracepoint __tracepoint_##name;                   \
>         static inline void trace_##name(proto)                          \
>         {                                                               \
>                 if (static_key_false(&__tracepoint_##name.key))         \
>                         __DO_TRACE(&__tracepoint_##name,                \
>                                 TP_PROTO(data_proto),                   \
>                                 TP_ARGS(data_args),                     \
>                                 TP_CONDITION(cond), 0);                 \
>                 if (IS_ENABLED(CONFIG_LOCKDEP) && (cond)) {             \
>                         rcu_read_lock_sched_notrace();                  \
>                         rcu_dereference_sched(__tracepoint_##name.funcs);\
>                         rcu_read_unlock_sched_notrace();                \
>                 }                                                       \
>         }
> ------
> 
> BTW, this ipi_raise trace event is only used in arm/arm64.
> 
> $ git grep trace_ipi_raise
> arch/arm/kernel/smp.c:  trace_ipi_raise_rcuidle(target, ipi_types[ipinr]);
> arch/arm64/kernel/smp.c:        trace_ipi_raise(target, ipi_types[ipinr]);
> include/trace/events/ipi.h: * __tracepoint_string, ideally the same as
> used with trace_ipi_raise
> include/trace/events/ipi.h: * __tracepoint_string, ideally the same as
> used with trace_ipi_raise for
> 
> Thus, this only occurs on arm64 platform, but not on x86.
> 
> To fix this, maybe we need to remove this event or mark this is under
> rcu watched (but is it really watched?)

Good analysis, Masami :)

Seems to me, Arm64 should study Arm platform to change using
trace_ipi_raise_rcuidle() to work around the restriction so can allow the
event to work in idle context?

Thanks,
Leo

> 2020年4月8日(水) 18:31 Naresh Kamboju <naresh.kamboju@xxxxxxxxxx>:
> 
> >
> > On arm64 qemu_arm64, juno-r2 and dragonboard-410c while running kselftest ftrace
> > on stable rc 5.5.1-rc1 to till today 5.5.16-rc2 and 5.6  found this
> > kernel warning.
> >
> > [  386.349099] kselftest: Running tests in ftrace
> > [  393.984018]
> > [  393.984290] =============================
> > [  393.984781] WARNING: suspicious RCU usage
> > [  393.988690] 5.6.3-rc2 #1 Not tainted
> > [  393.992679] -----------------------------
> > [  393.996327] /usr/src/kernel/include/trace/events/ipi.h:36
> > suspicious rcu_dereference_check() usage!
> > [  394.000241]
> > [  394.000241] other info that might help us debug this:
> > [  394.000241]
> > [  394.009094]
> > [  394.009094] RCU used illegally from idle CPU!
> > [  394.009094] rcu_scheduler_active = 2, debug_locks = 1
> > [  394.017084] RCU used illegally from extended quiescent state!
> > [  394.028187] 1 lock held by swapper/3/0:
> > [  394.033826]  #0: ffff80001237b6a8 (max_trace_lock){....}, at:
> > check_critical_timing+0x7c/0x1a8
> > [  394.037480]
> > [  394.037480] stack backtrace:
> > [  394.046158] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.6.3-rc2 #1
> > [  394.050584] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> > [  394.056663] Call trace:
> > [  394.063515]  dump_backtrace+0x0/0x1e0
> > [  394.065686]  show_stack+0x24/0x30
> > [  394.069505]  dump_stack+0xe8/0x150
> > [  394.072805]  lockdep_rcu_suspicious+0xcc/0x110
> > [  394.076106]  arch_irq_work_raise+0x208/0x210
> > [  394.080533]  __irq_work_queue_local+0x5c/0x80
> > [  394.084959]  irq_work_queue+0x38/0x78
> > [  394.089212]  __update_max_tr+0x150/0x218
> > [  394.092858]  update_max_tr_single.part.82+0x98/0x100
> > [  394.096851]  update_max_tr_single+0x1c/0x28
> > [  394.101798]  check_critical_timing+0x198/0x1a8
> > [  394.105705]  stop_critical_timings+0x128/0x148
> > [  394.110221]  cpuidle_enter_state+0x74/0x4f8
> > [  394.114645]  cpuidle_enter+0x3c/0x50
> > [  394.118726]  call_cpuidle+0x44/0x80
> > [  394.122542]  do_idle+0x22c/0x2d0
> > [  394.125755]  cpu_startup_entry+0x28/0x48
> > [  394.129229]  secondary_start_kernel+0x1b4/0x210
> >
> >
> > metadata:
> >   git branch: linux-5.5.y and linux-5.6.y
> >   git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
> >   kernel-config:
> > http://snapshots.linaro.org/openembedded/lkft/lkft/sumo/juno/lkft/linux-stable-rc-5.5/65/config
> >
> > Full test log,
> > 5.6 test logs,
> > https://lkft.validation.linaro.org/scheduler/job/1350627#L12612
> > https://lkft.validation.linaro.org/scheduler/job/1350731#L9509
> >
> > 5.5 test logs,
> > https://lkft.validation.linaro.org/scheduler/job/1322704#L9777
> > https://lkft.validation.linaro.org/scheduler/job/1153369#L9745
> > https://lkft.validation.linaro.org/scheduler/job/1351155#L8982
> > https://lkft.validation.linaro.org/scheduler/job/1351065#L12349
> >
> > --
> > Linaro LKFT
> > https://lkft.linaro.org
> 
> 
> 
> --
> Masami Hiramatsu



[Index of Archives]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Share Photos]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Device Mapper]

  Powered by Linux