Patch "powerpc/opal-irqchip: Fix deadlock introduced by "Fix double endian conversion"" has been added to the 4.3-stable tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This is a note to let you know that I've just added the patch titled

    powerpc/opal-irqchip: Fix deadlock introduced by "Fix double endian conversion"

to the 4.3-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     powerpc-opal-irqchip-fix-deadlock-introduced-by-fix-double-endian-conversion.patch
and it can be found in the queue-4.3 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.


>From 036592fbbe753d236402a0ae68148e7c143a0f0e Mon Sep 17 00:00:00 2001
From: Alistair Popple <alistair@xxxxxxxxxxxx>
Date: Fri, 18 Dec 2015 17:16:17 +1100
Subject: powerpc/opal-irqchip: Fix deadlock introduced by "Fix double endian conversion"

From: Alistair Popple <alistair@xxxxxxxxxxxx>

commit 036592fbbe753d236402a0ae68148e7c143a0f0e upstream.

Commit 25642e1459ac ("powerpc/opal-irqchip: Fix double endian
conversion") fixed an endian bug by calling opal_handle_events() in
opal_event_unmask().

However this introduced a deadlock if we find an event is active
during unmasking and call opal_handle_events() again. The bad call
sequence is:

  opal_interrupt()
  -> opal_handle_events()
     -> generic_handle_irq()
        -> handle_level_irq()
           -> raw_spin_lock(&desc->lock)
              handle_irq_event(desc)
              unmask_irq(desc)
              -> opal_event_unmask()
                 -> opal_handle_events()
                    -> generic_handle_irq()
                       -> handle_level_irq()
                          -> raw_spin_lock(&desc->lock)	(BOOM)

When generating multiple opal events in quick succession this would lead
to the following stall warnings:

EEH: Fenced PHB#0 detected, location: U78C9.001.WZS09XA-P1-C32
INFO: rcu_sched detected stalls on CPUs/tasks:

         12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=2065
         15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=2065
         (detected by 13, t=2102 jiffies, g=1325, c=1324, q=602)
NMI watchdog: BUG: soft lockup - CPU#18 stuck for 22s! [irqbalance:2696]
INFO: rcu_sched detected stalls on CPUs/tasks:
         12-...: (1 GPs behind) idle=68f/140000000000001/0 softirq=860/861 fqs=8371
         15-...: (1 GPs behind) idle=be5/140000000000001/0 softirq=1142/1143 fqs=8371
         (detected by 20, t=8407 jiffies, g=1325, c=1324, q=1290)

This patch corrects the problem by queuing the work if an event is
active during unmasking, which is similar to the pre-endian fix
behaviour.

Fixes: 25642e1459ac ("powerpc/opal-irqchip: Fix double endian conversion")
Signed-off-by: Alistair Popple <alistair@xxxxxxxxxxxx>
Reported-by: Andrew Donnellan <andrew.donnellan@xxxxxxxxxxx>
Signed-off-by: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
 arch/powerpc/platforms/powernv/opal-irqchip.c |   14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

--- a/arch/powerpc/platforms/powernv/opal-irqchip.c
+++ b/arch/powerpc/platforms/powernv/opal-irqchip.c
@@ -83,7 +83,19 @@ static void opal_event_unmask(struct irq
 	set_bit(d->hwirq, &opal_event_irqchip.mask);
 
 	opal_poll_events(&events);
-	opal_handle_events(be64_to_cpu(events));
+	last_outstanding_events = be64_to_cpu(events);
+
+	/*
+	 * We can't just handle the events now with opal_handle_events().
+	 * If we did we would deadlock when opal_event_unmask() is called from
+	 * handle_level_irq() with the irq descriptor lock held, because
+	 * calling opal_handle_events() would call generic_handle_irq() and
+	 * then handle_level_irq() which would try to take the descriptor lock
+	 * again. Instead queue the events for later.
+	 */
+	if (last_outstanding_events & opal_event_irqchip.mask)
+		/* Need to retrigger the interrupt */
+		irq_work_queue(&opal_event_irq_work);
 }
 
 static int opal_event_set_type(struct irq_data *d, unsigned int flow_type)


Patches currently in stable-queue which might be from alistair@xxxxxxxxxxxx are

queue-4.3/powerpc-opal-irqchip-fix-double-endian-conversion.patch
queue-4.3/powerpc-opal-irqchip-fix-deadlock-introduced-by-fix-double-endian-conversion.patch
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]