Hello Thomas,

Since you are the irqchip maintainer, this patch, as well as the
following patch "irqchip: armada-370-xp: fix MSI race condition",
should go through your tree. They were sent quite some time ago. Is
there any reason you haven't picked them up?

I realized that I only Cc'ed you when sending them originally, while
you should have been the main recipient. Maybe that explains why I
didn't get any feedback.

Would it be possible to merge them for 3.13? Note that the patch below
should also be backported to the stable kernels, all the way back to
3.8.

Thanks a lot!

Thomas

On Mon, 25 Nov 2013 17:26:44 +0100, Thomas Petazzoni wrote:
> From: Lior Amsalem <alior@xxxxxxxxxxx>
>
> In the Armada 370/XP driver, when we receive an IRQ 0, we read the
> list of doorbells that caused the interrupt from register
> ARMADA_370_XP_IN_DRBEL_CAUSE_OFFS. This gives the list of IPIs that
> were generated. However, instead of acknowledging only the IPIs that
> were generated, we acknowledge *all* the IPIs, by writing
> ~IPI_DOORBELL_MASK in the ARMADA_370_XP_IN_DRBEL_CAUSE_OFFS register.
>
> This creates a race condition: if a new IPI that isn't part of the
> ones read into the temporary "ipimask" variable is fired before we
> acknowledge all IPIs, then we will simply lose it. This is causing
> scheduling hangs on SMP intensive workloads.
>
> It is important to mention that this ARMADA_370_XP_IN_DRBEL_CAUSE_OFFS
> register has the following behavior: "A CPU write of 0 clears the bits
> in this field. A CPU write of 1 has no effect". This is what allows us
> to simply write ~ipimask to acknowledge the handled IPIs.
>
> Notice that the same problem is present in the MSI implementation, but
> it will be fixed as a separate patch, so that this IPI fix can be
> pushed to older stable versions as appropriate (all the way to 3.8),
> while the MSI code only appeared in 3.13.
>
> Signed-off-by: Lior Amsalem <alior@xxxxxxxxxxx>
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@xxxxxxxxxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> ---
> The problem has been present since 344e873e5657e8dc0 ('arm: mvebu: Add
> IPI support via doorbells'), that is since v3.8. However, notice that
> the IRQ driver was moved from arch/arm/mach-mvebu/ to drivers/irqchip
> in the process, and also that the very line being changed was slightly
> modified in 5ec69017cc944f3ed8 ('irqchip: armada-370-xp: slightly
> cleanup irq controller driver').
> ---
>  drivers/irqchip/irq-armada-370-xp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/irqchip/irq-armada-370-xp.c b/drivers/irqchip/irq-armada-370-xp.c
> index 433cc85..f5e49a2 100644
> --- a/drivers/irqchip/irq-armada-370-xp.c
> +++ b/drivers/irqchip/irq-armada-370-xp.c
> @@ -407,7 +407,7 @@ armada_370_xp_handle_irq(struct pt_regs *regs)
>                                  ARMADA_370_XP_IN_DRBEL_CAUSE_OFFS)
>                          & IPI_DOORBELL_MASK;
>
> -                writel(~IPI_DOORBELL_MASK, per_cpu_int_base +
> +                writel(~ipimask, per_cpu_int_base +
>                          ARMADA_370_XP_IN_DRBEL_CAUSE_OFFS);
>
>                  /* Handle all pending doorbells */

-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com
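
For anyone who wants to convince themselves of the race without the
hardware, here is a rough user-space sketch of the acknowledge sequence
described in the commit message. It is not driver code: sim_cause,
sim_readl(), sim_writel(), ack_buggy() and ack_fixed() are hypothetical
names made up for this illustration, the value 0xff for
IPI_DOORBELL_MASK is an assumption, and the "late" IPI is injected by
hand between the read and the acknowledge instead of coming from
another CPU.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: the low 8 doorbell bits are the IPIs. */
#define IPI_DOORBELL_MASK 0x000000ffu

/* Stand-in for the ARMADA_370_XP_IN_DRBEL_CAUSE_OFFS register. */
static uint32_t sim_cause;

static uint32_t sim_readl(void)
{
	return sim_cause;
}

/*
 * Models the quoted register behavior: a write of 0 clears the bit,
 * a write of 1 has no effect.
 */
static void sim_writel(uint32_t val)
{
	sim_cause &= val;
}

/*
 * Old behavior: acknowledge every IPI bit, whether or not it was read.
 * Returns the IPI bits still pending in the register afterwards.
 */
static uint32_t ack_buggy(uint32_t pending, uint32_t late_ipi)
{
	uint32_t ipimask;

	sim_cause = pending;
	ipimask = sim_readl() & IPI_DOORBELL_MASK;
	(void)ipimask;			/* read, but not used for the ack */

	sim_cause |= late_ipi;		/* new IPI fires after the read */
	sim_writel(~IPI_DOORBELL_MASK);	/* clears the late IPI as well */

	return sim_cause & IPI_DOORBELL_MASK;
}

/*
 * Patched behavior: acknowledge only the IPIs captured in ipimask.
 * Returns the IPI bits still pending in the register afterwards.
 */
static uint32_t ack_fixed(uint32_t pending, uint32_t late_ipi)
{
	uint32_t ipimask;

	sim_cause = pending;
	ipimask = sim_readl() & IPI_DOORBELL_MASK;

	sim_cause |= late_ipi;		/* new IPI fires after the read */
	sim_writel(~ipimask);		/* late IPI bit is written as 1: kept */

	return sim_cause & IPI_DOORBELL_MASK;
}
```

With IPI 0 pending and IPI 2 arriving late, ack_buggy(0x1, 0x4) leaves
nothing pending, so IPI 2 is never handled, while ack_fixed(0x1, 0x4)
leaves bit 2 set so that it is picked up on the next interrupt.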