On Wed, 17 Nov 2021 10:16:53 +0000, Pingfan Liu <kernelfans@xxxxxxxxx> wrote: > > On Tue, Nov 16, 2021 at 09:53:25AM +0000, Marc Zyngier wrote: > > Hi Pingfan, > > > > On Tue, 16 Nov 2021 08:24:49 +0000, > > Pingfan Liu <kernelfans@xxxxxxxxx> wrote: > > > > > > Arch level code is ready to take over the nmi_enter()/nmi_exit() > > > housekeeping. > > > > > > GICv3 can expose the pNMI discriminator, then simply remove the > > > housekeeping. > > > > > > Signed-off-by: Pingfan Liu <kernelfans@xxxxxxxxx> > > > Cc: Catalin Marinas <catalin.marinas@xxxxxxx> > > > Cc: Will Deacon <will@xxxxxxxxxx> > > > Cc: Mark Rutland <mark.rutland@xxxxxxx> > > > Cc: Marc Zyngier <maz@xxxxxxxxxx> > > > Cc: Joey Gouly <joey.gouly@xxxxxxx> > > > Cc: Sami Tolvanen <samitolvanen@xxxxxxxxxx> > > > Cc: Julien Thierry <julien.thierry@xxxxxxx> > > > Cc: Yuichi Ito <ito-yuichi@xxxxxxxxxxx> > > > Cc: rcu@xxxxxxxxxxxxxxx > > > To: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx > > > --- > > > drivers/irqchip/irq-gic-v3.c | 18 ++++++++++++------ > > > 1 file changed, 12 insertions(+), 6 deletions(-) > > > > > > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c > > > index daec3309b014..aa2bcb47b47e 100644 > > > --- a/drivers/irqchip/irq-gic-v3.c > > > +++ b/drivers/irqchip/irq-gic-v3.c > > > @@ -646,12 +646,8 @@ static void gic_deactivate_unhandled(u32 irqnr) > > > > > > static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs) > > > { > > > - bool irqs_enabled = interrupts_enabled(regs); > > > int err; > > > > > > - if (irqs_enabled) > > > - nmi_enter(); > > > - > > > if (static_branch_likely(&supports_deactivate_key)) > > > gic_write_eoir(irqnr); > > > /* > > > @@ -664,8 +660,6 @@ static inline void gic_handle_nmi(u32 irqnr, struct pt_regs *regs) > > > if (err) > > > gic_deactivate_unhandled(irqnr); > > > > > > - if (irqs_enabled) > > > - nmi_exit(); > > > } > > > > > > static u32 do_read_iar(struct pt_regs *regs) > > > @@ -702,6 +696,15 @@ static u32 do_read_iar(struct pt_regs *regs) > > > return iar; > > > } > > > > > > +static bool gic_is_in_nmi(void) > > > +{ > > > + if (gic_supports_nmi() && > > > + unlikely(gic_read_rpr() == GICD_INT_RPR_PRI(GICD_INT_NMI_PRI))) > > > + return true; > > > > I don't think this fixes anything. > > > > RPR stands for 'Running Priority Register', which in GIC speak reports > > the priority of the most recently Ack'ed interrupt. > > > > You cannot use this to find out whether the interrupt that you /will/ > > ack is a NMI or not. Actually, you cannot find out about *any* > > priority until you actually ack the interrupt. What you are asking for > > is the equivalent of a crystal ball, and we're in short supply... ;-) > > > > The only case where ICC_RPR_EL1 will return something that is equal to > > GICD_INT_NMI_PRI is when you are *already* in an NMI context. So > > unless I have completely misunderstood your approach (which is always > > possible), I don't see how this can work. > > > > Thank you for the clear explanation. Also I revist this part in "GIC v3 > and v4 overview" and have a deeper understanding. (Need to spare time to > go through all later) > > You totally got my idea, and I need to find a bail-out. > > As all kinds of PIC at least have two parts of functions: active (Ack) and > deactive(EOI), is it possible to split handle_arch_irq into two parts? > I.e let irqchip expose two interfaces: > u32 (*read_irqinfo*)(struct pt_regs *regs, bool *is_nmi) > void (*handle_arch_irq)(struct pt_regs *regs, u32 irqnr) > to replace the current interface: No. There is no way we will move to such a scheme. We want to isolate the irqchip stuff in its own corner, and not propagate it into the arch code. If the pseudo-NMI is such a problem, I'm all for removing it *now*, never to come back again. > void (*handle_arch_irq)(struct pt_regs *regs) > > I have thought about such stuff for some days. And the benefits include: > -1. For this bugfix (by the parameter 'is_nmi') > -2. IPI_RESCHEDULE performance drop issue can be resolved at arch > code level. (by irqnr - ipi_irq_base == IPI_RESCHEDULE ?) > -3. The arch level can provide a similar loop as aic_handle_irq() in irq-apple-aic.c, > which can save cpu by avoiding heavy context sync when irq is > intensive. > > Do you think it is doable? I don't think this is even needed, because I don't believe that the whole thing is a real problem. In patch #1, you are claiming that handling a NMI as an IRQ first, and then upgrading to NMI once we know it really is an NMI is a problem. How different is this from an IRQ being preempted by a NMI? Your own conclusion is that the this later case isn't a problem. So why is the former an issue? I'm not saying that there is no issue at all, and it could well be that you have spotted something that I cannot see yet. But if that's the case, it means that the core code is broken as well. M. -- Without deviation from the norm, progress is not possible.