On Thu, Feb 13, 2020 at 03:44:44PM -0500, Joel Fernandes wrote:

> > > That _should_ already be the case today. That is, if we end up in a
> > > tracer and in_nmi() is unreliable we're already screwed anyway.

> I removed the static from rcu_nmi_enter()/exit() as it is called from
> outside, that makes it build now. Updated below is Paul's diff. I also added
> NOKPROBE_SYMBOL() to rcu_nmi_exit() to match rcu_nmi_enter() since it seemed
> asymmetric.

> +__always_inline void rcu_nmi_exit(void)
>  {
>  	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
>
> @@ -651,25 +653,15 @@ static __always_inline void rcu_nmi_exit_common(bool irq)
>  	trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, atomic_read(&rdp->dynticks));
>  	WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
>
> -	if (irq)
> +	if (!in_nmi())
>  		rcu_prepare_for_idle();
>
>  	rcu_dynticks_eqs_enter();
>
> -	if (irq)
> +	if (!in_nmi())
>  		rcu_dynticks_task_enter();
>  }

Boris and I have been going over the #MC code (and finding loads of
'interesting' code) and ran into ist_enter(), which has the following
code:

	/*
	 * We might have interrupted pretty much anything.  In
	 * fact, if we're a machine check, we can even interrupt
	 * NMI processing.  We don't want in_nmi() to return true,
	 * but we need to notify RCU.
	 */
	rcu_nmi_enter();

Which, to me, sounds all sorts of broken. The IST (be it #DB or #MC) can
happen while we're holding all sorts of locks. This must be an NMI-like
context.
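
To spell out why that is a problem for the in_nmi() approach above: in_nmi()
is keyed off the NMI bits in preempt_count(), and ist_enter() deliberately
never sets those. Roughly (an abridged sketch, not the literal tree; the real
definitions live in include/linux/preempt.h, include/linux/hardirq.h and
arch/x86/kernel/traps.c):

/* in_nmi() only looks at the NMI bits of preempt_count() */
#define in_nmi()	(preempt_count() & NMI_MASK)

/* a real NMI sets those bits before notifying RCU ... */
#define nmi_enter()						\
	do {							\
		preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET);	\
		rcu_nmi_enter();				\
	} while (0)

/* ... while ist_enter() only does the RCU half */
void ist_enter(struct pt_regs *regs)
{
	if (user_mode(regs)) {
		RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
	} else {
		/* notify RCU, but leave preempt_count() alone */
		rcu_nmi_enter();
	}

	preempt_disable();
}

So inside #DB/#MC, in_nmi() is false and the rcu_nmi_exit() above takes the
interrupt branches (rcu_prepare_for_idle(), rcu_dynticks_task_enter()), even
though the exception may well have interrupted an NMI or code holding
arbitrary locks.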