[PATCH] usb, ehci: Avoid deadlock of ehci->lock by disabling interrupts

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



We ran into an interesting deadlock on RHEL-5 (2.6.18) that I believe
still appiles to the current kernel involving the ehci->lock.

CPU A:
submits a bulk transfer urb
ehci_urb_enqueue calls submit_async
submit_async blocks on ehci->lock with irq disabled (the result
of spin_lock_irqsave) for CPU B

CPU B:
takes an ehci interrupt
locks ehci->lock
pre-empted by an IPI handler which spins waiting for CPU C

CPU C:
takes an MTRR request
sends an IPI to all cpus to block
spins waiting for all cpus to block

CPU A nevers processes IPI because its interrupts are disabled,
this creates the 3-way deadlock.

This deadlock is hard to reproduce by our customer, but based on their vmcore
it seems clear the above is what happened.  I attatched a suggested patch
from a colleague that would seem to resolve the problem.  Because it is
hard to reproduce, I have not been able to test it to verify it resolves
the problem.

The patch just turns spin_locks in the spin_lock_irqsaves in the ehci_irq
function.  This would essentially block the IPI handler and let the interrupt
handler finish before processing the IPI.  Then CPU A would get a chance to
finish and process its IPI.

Looking at the code paths in 2.6.18 and 3.5, the locking still seems the same
which is why I believe the problem still exists.  However, someone in the office
thought the MTRR code has been re-written, so the problem we are seeing might
be more difficult to see with the current kernel.

This patch does feel awkward, disabling interrupts in the irq handler.  It seems
like it would make more sense to remove the locking from the irq handler.  But
that is probably more work and my knowledge of USB is limited.  I'll start with
this patch and see where the conversation goes.

Any feedback would be appreciated.

[dzickus - rewrote the description, ported to v3.5]

Signed-off-by: Casey Dahlin <cdahlin@xxxxxxxxxx>
Signed-off-by: Don Zickus <dzickus@xxxxxxxxxx>
---
 drivers/usb/host/ehci-hcd.c |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c
index 800be38..a31d9cc 100644
--- a/drivers/usb/host/ehci-hcd.c
+++ b/drivers/usb/host/ehci-hcd.c
@@ -849,8 +849,9 @@ static irqreturn_t ehci_irq (struct usb_hcd *hcd)
 	struct ehci_hcd		*ehci = hcd_to_ehci (hcd);
 	u32			status, masked_status, pcd_status = 0, cmd;
 	int			bh;
+	unsigned long		flags;
 
-	spin_lock (&ehci->lock);
+	spin_lock_irqsave(&ehci->lock, flags);
 
 	status = ehci_readl(ehci, &ehci->regs->status);
 
@@ -868,7 +869,7 @@ static irqreturn_t ehci_irq (struct usb_hcd *hcd)
 
 	/* Shared IRQ? */
 	if (!masked_status || unlikely(ehci->rh_state == EHCI_RH_HALTED)) {
-		spin_unlock(&ehci->lock);
+		spin_unlock_irqrestore(&ehci->lock, flags);
 		return IRQ_NONE;
 	}
 
@@ -969,7 +970,7 @@ dead:
 
 	if (bh)
 		ehci_work (ehci);
-	spin_unlock (&ehci->lock);
+	spin_unlock_irqrestore(&ehci->lock, flags);
 	if (pcd_status)
 		usb_hcd_poll_rh_status(hcd);
 	return IRQ_HANDLED;
-- 
1.7.7.6

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux