Re: [PATCHv3 1/4] arm64: entry: judge nmi ealier to avoid deadlock in RCU

Pingfan Liu <kernelfans@xxxxxxxxx> · Fri, 19 Nov 2021 10:01:00 +0800

On Wed, Nov 17, 2021 at 11:38:54AM +0000, Mark Rutland wrote:
> On Tue, Nov 16, 2021 at 04:24:47PM +0800, Pingfan Liu wrote:
> > Linux kernel places strict semantics between NMI and maskable interrupt.
> > So does the RCU component, else deadlock may happen. But the current
> > arm64 entry code can partially breach this rule through calling
> > rcu_nmi_enter().
> > 
> > ***  how a deadlock can happen if NMI mistaken as IRQ  ***
> > 
> >     rcu_nmi_enter()
> >     {
> > 
> >         if (rcu_dynticks_curr_cpu_in_eqs()) {
> > 
> >         	if (!in_nmi())
> >         		rcu_dynticks_task_exit();
> >                 ...
> >         	if (!in_nmi()) {
> >         		instrumentation_begin();
> >         		rcu_cleanup_after_idle();
> >         		instrumentation_end();
> >         	}
> >                 ...
> >         } else if (!in_nmi()) {
> >         	instrumentation_begin();
> >         	rcu_irq_enter_check_tick();
> >         }
> > 
> >     }
> > 
> > If a NMI is mistaken as a maskable interrupt, rcu_irq_enter_check_tick()
> > can hit a deadlock, which is demonstrated by the following scenario:
> > 
> >     note_gp_changes() runs in a task context
> >     {
> >        local_irq_save(flags); // this protects against irq, but not NMI
> 
> Note: speecifically, this masks IRQs via ICC_PMR_EL1.
> 
> >        rnp = rdp->mynode;
> >        ...
> >        raw_spin_trylock_rcu_node(rnp)
> >                                   -------> broken in by (p)NMI, without taking __nmi_enter()
> 
> AFAICT the broken case described here *cannot* happen.
> 
> If we take a pNMI here, we'll go:
> 
>   el1h_64_irq()
>   -> el1h64_irq_handler()
>   -> el1_interrupt()
> 
> ... where el1_interrupt is:
> 
> | static void noinstr el1_interrupt(struct pt_regs *regs,
> |                                   void (*handler)(struct pt_regs *)) 
> | {
> |         write_sysreg(DAIF_PROCCTX_NOIRQ, daif);
> | 
> |         if (IS_ENABLED(CONFIG_ARM64_PSEUDO_NMI) && !interrupts_enabled(regs))
> |                 __el1_pnmi(regs, handler);
> |         else
> |                 __el1_irq(regs, handler);
> | } 
> 
> ... and interrupts_enabled() is:
> 
> | #define interrupts_enabled(regs)                        \
> |        (!((regs)->pstate & PSR_I_BIT) && irqs_priority_unmasked(regs))
> 
> ... and irqs_priority_unmasked is:
> 
> | #define irqs_priority_unmasked(regs)                                    \
> |         (system_uses_irq_prio_masking() ?                               \
> |                 (regs)->pmr_save == GIC_PRIO_IRQON :                    \
> |                 true)
> 
> ... irqs_priority_unmasked(regs) *must* return false for this case,
> since the local_irq_save() above must have set the PMR to
> GIC_PRIO_IRQOFF in the interrupted context.
> 
> Thus, we *must* call __el1_pnmi() in this case, where __el1_pnmi() is:
> 

Yes, you are right. And I made a mistake to think it will call
__el1_irq()
> | static __always_inline void __el1_pnmi(struct pt_regs *regs,
> |                                        void (*handler)(struct pt_regs *)) 
> | {
> |         arm64_enter_nmi(regs);
> |         do_interrupt_handler(regs, handler);
> |         arm64_exit_nmi(regs);
> | }
> 
> Note that arm64_{enter,exit}_nmi() handle the __nmi_{enter,exit}() calls around
> do_interupt_handler().
> 
> >                                            rcu_nmi_enter()
> >                                             ->__rcu_irq_enter_check_tick()
> >                                                ->raw_spin_lock_rcu_node(rdp->mynode);
> >                                                    deadlock happens!!!
> > 
> >     }
> 
> As above, I don't think this can go wrong as you describe, and don't believe
> that this can deadlock.
> 
> > *** On arm64, how pNMI mistaken as IRQ ***
> > 
> > On arm64, pNMI is an analogue to NMI. In essence, it is a higher
> > priority interrupt but not disabled by local_irq_disable().
> > 
> > In current implementation
> > 1) If a pNMI from a context where IRQs were masked, it can be recognized
> >    as nmi, and calls __nmi_enter() immediately. This is no problem.
> 
> Agreed, this case works correctly.
> 
> > 2) But it causes trouble if a pNMI from a context where IRQs were
> >    unmasked, and temporarily regarded as maskable interrupt.  It is not
> >    treated as NMI, i.e. calling nmi_enter() until reading from GIC.
> > 
> >     __el1_irq()
> >     {
> >       irq_enter_rcu()   ----> hit the deadlock bug
> 
> What is "the deadlock bug"? The example you had above was for a context where
> IRQs were *masked*, which is a different case.
> 

As above, I made a mistake about the condition hence the code path.
There is no problem in this case through calling __el1_pnmi()

Sorry for the false alarm and thank you again for your detailed
explaination.

Regards,

	Pingfan