On Fri, Nov 19, 2021 at 10:01:00AM +0800, Pingfan Liu wrote: > On Wed, Nov 17, 2021 at 11:38:54AM +0000, Mark Rutland wrote: > > On Tue, Nov 16, 2021 at 04:24:47PM +0800, Pingfan Liu wrote: > > > Linux kernel places strict semantics between NMI and maskable interrupt. > > > So does the RCU component, else deadlock may happen. But the current > > > arm64 entry code can partially breach this rule through calling > > > rcu_nmi_enter(). > > > > > > *** how a deadlock can happen if NMI mistaken as IRQ *** > > > > > > rcu_nmi_enter() > > > { > > > > > > if (rcu_dynticks_curr_cpu_in_eqs()) { > > > > > > if (!in_nmi()) > > > rcu_dynticks_task_exit(); > > > ... > > > if (!in_nmi()) { > > > instrumentation_begin(); > > > rcu_cleanup_after_idle(); > > > instrumentation_end(); > > > } > > > ... > > > } else if (!in_nmi()) { > > > instrumentation_begin(); > > > rcu_irq_enter_check_tick(); > > > } > > > > > > } > > > > > > If a NMI is mistaken as a maskable interrupt, rcu_irq_enter_check_tick() > > > can hit a deadlock, which is demonstrated by the following scenario: > > > > > > note_gp_changes() runs in a task context > > > { > > > local_irq_save(flags); // this protects against irq, but not NMI > > > > Note: speecifically, this masks IRQs via ICC_PMR_EL1. > > > > > rnp = rdp->mynode; > > > ... > > > raw_spin_trylock_rcu_node(rnp) > > > -------> broken in by (p)NMI, without taking __nmi_enter() > > > > AFAICT the broken case described here *cannot* happen. > > > > If we take a pNMI here, we'll go: > > > > el1h_64_irq() > > -> el1h64_irq_handler() > > -> el1_interrupt() > > > > ... where el1_interrupt is: > > > > | static void noinstr el1_interrupt(struct pt_regs *regs, > > | void (*handler)(struct pt_regs *)) > > | { > > | write_sysreg(DAIF_PROCCTX_NOIRQ, daif); > > | > > | if (IS_ENABLED(CONFIG_ARM64_PSEUDO_NMI) && !interrupts_enabled(regs)) > > | __el1_pnmi(regs, handler); > > | else > > | __el1_irq(regs, handler); > > | } > > > > ... and interrupts_enabled() is: > > > > | #define interrupts_enabled(regs) \ > > | (!((regs)->pstate & PSR_I_BIT) && irqs_priority_unmasked(regs)) > > > > ... and irqs_priority_unmasked is: > > > > | #define irqs_priority_unmasked(regs) \ > > | (system_uses_irq_prio_masking() ? \ > > | (regs)->pmr_save == GIC_PRIO_IRQON : \ > > | true) > > > > ... irqs_priority_unmasked(regs) *must* return false for this case, > > since the local_irq_save() above must have set the PMR to > > GIC_PRIO_IRQOFF in the interrupted context. > > > > Thus, we *must* call __el1_pnmi() in this case, where __el1_pnmi() is: > > > > Yes, you are right. And I made a mistake to think it will call > __el1_irq() > As above, I made a mistake about the condition hence the code path. > There is no problem in this case through calling __el1_pnmi() > > Sorry for the false alarm and thank you again for your detailed > explaination. No worries; thanks for confirming this was a false alarm. Glad we're now on the same page. :) Likewise, thanks for reporting what you thought was an issue; having more eyes on this is always good! Thanks, Mark.