On Tue, Feb 18, 2020 at 12:17:28PM -0800, Paul E. McKenney wrote:
> On Tue, Feb 18, 2020 at 08:58:31PM +0100, Peter Zijlstra wrote:
> > On Thu, Feb 13, 2020 at 03:44:44PM -0500, Joel Fernandes wrote:
> >
> > > > > That _should_ already be the case today. That is, if we end up in a
> > > > > tracer and in_nmi() is unreliable we're already screwed anyway.
> >
> > > I removed the static from rcu_nmi_enter()/exit() as it is called from
> > > outside, that makes it build now. Updated below is Paul's diff. I also added
> > > NOKPROBE_SYMBOL() to rcu_nmi_exit() to match rcu_nmi_enter() since it seemed
> > > asymmetric.
> >
> > > +__always_inline void rcu_nmi_exit(void)
> > >  {
> > >  	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
> > >
> > > @@ -651,25 +653,15 @@ static __always_inline void rcu_nmi_exit_common(bool irq)
> > >  	trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, atomic_read(&rdp->dynticks));
> > >  	WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
> > >
> > > -	if (irq)
> > > +	if (!in_nmi())
> > >  		rcu_prepare_for_idle();
> > >
> > >  	rcu_dynticks_eqs_enter();
> > >
> > > -	if (irq)
> > > +	if (!in_nmi())
> > >  		rcu_dynticks_task_enter();
> > >  }
> >
> > Boris and me have been going over the #MC code (and finding loads of
> > 'interesting' code) and ran into ist_enter(), which has the following
> > code:
> >
> > 	/*
> > 	 * We might have interrupted pretty much anything. In
> > 	 * fact, if we're a machine check, we can even interrupt
> > 	 * NMI processing. We don't want in_nmi() to return true,
> > 	 * but we need to notify RCU.
> > 	 */
> > 	rcu_nmi_enter();
> >
> >
> > Which, to me, sounds all sorts of broken. The IST (be it #DB or #MC) can
> > happen while we're holding all sorts of locks. This must be an NMI-like
> > context.
>
> Ouch! Looks like I need to hold off on getting rid of the "irq"
> parameters if in_nmi() isn't going to be accurate.

I'm currently trying to twist my brain around all this, because I suspect it's all completely broken one way or another. But yes, we definitely need to fix this before your patch goes in.
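
To make the concern concrete, here is a rough user-space model in plain C. It is not kernel code, and every model_* name is made up for illustration. It assumes, as I read the current tree, that the rcu_nmi_exit() path behaves like irq == false, and that the IST exit mirrors ist_enter() in leaving in_nmi() false. Under those assumptions the in_nmi()-based check sends an IST exit down the irq-only path, while the explicit parameter does not:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's in_nmi(); set by the entry model. */
static bool model_in_nmi;

/* Counts calls into the path that is only meant to run on irq exit. */
static int model_idle_prep_calls;

static void model_prepare_for_idle(void)
{
	model_idle_prep_calls++;
}

/* Old scheme: the caller says explicitly whether this is an irq exit. */
static void model_exit_with_param(bool irq)
{
	if (irq)
		model_prepare_for_idle();
}

/* Patched scheme: the exit path infers the context from in_nmi(). */
static void model_exit_with_in_nmi(void)
{
	if (!model_in_nmi)
		model_prepare_for_idle();
}

/*
 * One exit from an NMI-like context. 'in_nmi_state' is what in_nmi()
 * would report there: true for a real NMI, false for an IST entry
 * (assumption: ist_enter() deliberately keeps in_nmi() false).
 * Returns how many times the irq-only path ran.
 */
static int model_nmi_like_exit(bool in_nmi_state, bool use_in_nmi)
{
	model_idle_prep_calls = 0;
	model_in_nmi = in_nmi_state;

	if (use_in_nmi)
		model_exit_with_in_nmi();
	else
		model_exit_with_param(false);	/* NMI-style exit: irq == false */

	return model_idle_prep_calls;
}

/* Sanity check of the model: only IST + in_nmi() inference misbehaves. */
static void model_check(void)
{
	assert(model_nmi_like_exit(true, false) == 0);	/* real NMI, old scheme */
	assert(model_nmi_like_exit(true, true) == 0);	/* real NMI, patched */
	assert(model_nmi_like_exit(false, false) == 0);	/* IST, old scheme */
	assert(model_nmi_like_exit(false, true) == 1);	/* IST, patched: wrong path */
}
```

Calling model_check() from a test harness exercises all four cases; the last one is the combination that worries me.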