Patch "powerpc/64s: Fix lost pending interrupt due to race causing lost update to irq_happened" has been added to the 4.9-stable tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This is a note to let you know that I've just added the patch titled

    powerpc/64s: Fix lost pending interrupt due to race causing lost update to irq_happened

to the 4.9-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     powerpc-64s-fix-lost-pending-interrupt-due-to-race-causing-lost-update-to-irq_happened.patch
and it can be found in the queue-4.9 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


>From ff6781fd1bb404d8a551c02c35c70cec1da17ff1 Mon Sep 17 00:00:00 2001
From: Nicholas Piggin <npiggin@xxxxxxxxx>
Date: Wed, 21 Mar 2018 12:22:28 +1000
Subject: powerpc/64s: Fix lost pending interrupt due to race causing lost update to irq_happened

From: Nicholas Piggin <npiggin@xxxxxxxxx>

commit ff6781fd1bb404d8a551c02c35c70cec1da17ff1 upstream.

force_external_irq_replay() can be called in the do_IRQ path with
interrupts hard enabled and soft disabled if may_hard_irq_enable() set
MSR[EE]=1. It updates local_paca->irq_happened with a load, modify,
store sequence. If a maskable interrupt hits during this sequence, it
will go to the masked handler to be marked pending in irq_happened.
This update will be lost when the interrupt returns and the store
instruction executes. This can result in unpredictable latencies,
timeouts, lockups, etc.

Fix this by ensuring hard interrupts are disabled before modifying
irq_happened.

This could cause any maskable asynchronous interrupt to get lost, but
it was noticed on P9 SMP system doing RDMA NVMe target over 100GbE,
so very high external interrupt rate and high IPI rate. The hang was
bisected down to enabling doorbell interrupts for IPIs. These provided
an interrupt type that could run at high rates in the do_IRQ path,
stressing the race.

Fixes: 1d607bb3bd60 ("powerpc/irq: Add mechanism to force a replay of interrupts")
Cc: stable@xxxxxxxxxxxxxxx # v4.8+
Reported-by: Carol L. Soto <clsoto@xxxxxxxxxx>
Signed-off-by: Nicholas Piggin <npiggin@xxxxxxxxx>
Signed-off-by: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
 arch/powerpc/kernel/irq.c |    8 ++++++++
 1 file changed, 8 insertions(+)

--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -372,6 +372,14 @@ void force_external_irq_replay(void)
 	 */
 	WARN_ON(!arch_irqs_disabled());
 
+	/*
+	 * Interrupts must always be hard disabled before irq_happened is
+	 * modified (to prevent lost update in case of interrupt between
+	 * load and store).
+	 */
+	__hard_irq_disable();
+	local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
+
 	/* Indicate in the PACA that we have an interrupt to replay */
 	local_paca->irq_happened |= PACA_IRQ_EE;
 }


Patches currently in stable-queue which might be from npiggin@xxxxxxxxx are

queue-4.9/powerpc-64s-fix-lost-pending-interrupt-due-to-race-causing-lost-update-to-irq_happened.patch
queue-4.9/powerpc-64s-fix-i-side-slb-miss-bad-address-handler-saving-nonvolatile-gprs.patch



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]