Re: USB storage vanilla kernel 3.13 hang on DELL PRECISION M6400

On Wed, 12 Feb 2014, Stefani Seibold wrote:

> > > > >> Okay, the debugging info in your dmesg log indicates the cause of the
> > > > >> problem.  It looks like the bug is related to commit 88ed9fd50e57
> > > > >> (usb/hcd: remove unnecessary local_irq_save) by Michael Opdenacker.  

For the benefit of people who haven't seen the log, here is the 
important part:

[    3.431781] [ INFO: inconsistent lock state ]
[    3.431784] 3.13.2 #4 Not tainted
[    3.431786] ---------------------------------
[    3.431788] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
[    3.431792] swapper/3/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
[    3.431794]  (&(&ehci->lock)->rlock){?.-...}, at: [<c04c9ad1>] ehci_hrtimer_func+0x21/0xc0
[    3.431805] {HARDIRQ-ON-W} state was registered at:
[    3.431807]   [<c0183750>] __lock_acquire+0x590/0x1bb0
[    3.431813]   [<c01852fe>] lock_acquire+0x7e/0x110
[    3.431818]   [<c06bef4f>] _raw_spin_lock+0x3f/0x50
[    3.431833]   [<c04d0ce7>] ehci_irq+0x27/0x3d0
[    3.431835]   [<c04b2521>] usb_hcd_irq+0x21/0x30
[    3.431839]   [<c018f596>] irq_forced_thread_fn+0x26/0x50
[    3.431842]   [<c018f38e>] irq_thread+0xfe/0x130
[    3.431844]   [<c015b5bb>] kthread+0x9b/0xb0

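This says that ehci->lock was acquired by ehci_irq() with interrupts
enabled (the irq_forced_thread_fn() frame shows the handler was running
in a forced IRQ thread).  Then later on, ehci_hrtimer_func() acquired
the same lock in hard-IRQ context.  This caused the lockdep violation.
If the timer interrupt arrives on the CPU that already holds the lock,
the handler spins forever, which presumably is what eventually caused
the system to hang.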

The thing is, ehci_irq() is a non-threaded IRQ handler.  It's _never_
supposed to run with interrupts enabled.  As far as I can see, this
happened because irq_forced_thread_fn() did not disable interrupts
before calling the handler routine.
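
To make the lockdep complaint concrete, here is a small userspace
sketch of the check behind the "inconsistent {HARDIRQ-ON-W} ->
{IN-HARDIRQ-W}" message.  It is only a toy model, not the kernel's
lockdep code: every name in it (toy_lock, toy_cpu, toy_acquire,
toy_replay) is made up for illustration, and it tracks just the two
usage bits that matter here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; these types and functions are hypothetical and do not
 * exist in the kernel. */
struct toy_lock {
	bool used_in_hardirq;       /* ever taken from hard-IRQ context   */
	bool used_with_hardirqs_on; /* ever taken while hard IRQs were on */
};

struct toy_cpu {
	bool in_hardirq;  /* currently running a hard-IRQ handler */
	bool hardirqs_on; /* hard IRQs currently enabled          */
};

/* Record one acquisition.  Returns false once the lock has been taken
 * both in hard-IRQ context and with hard IRQs enabled, which is the
 * combination reported in the log. */
static bool toy_acquire(struct toy_lock *lock, const struct toy_cpu *cpu)
{
	assert(lock != NULL && cpu != NULL);

	if (cpu->in_hardirq)
		lock->used_in_hardirq = true;
	else if (cpu->hardirqs_on)
		lock->used_with_hardirqs_on = true;

	return !(lock->used_in_hardirq && lock->used_with_hardirqs_on);
}

/* Replay the sequence from the log.  irqs_off_in_thread models whether
 * the forced-thread wrapper disabled interrupts before calling the
 * handler. */
static bool toy_replay(bool irqs_off_in_thread)
{
	struct toy_lock ehci_lock = { false, false };
	struct toy_cpu irq_thread = {
		.in_hardirq = false,
		.hardirqs_on = !irqs_off_in_thread,
	};
	struct toy_cpu hrtimer = { .in_hardirq = true, .hardirqs_on = false };

	/* ehci_irq() running in the forced IRQ thread */
	bool ok = toy_acquire(&ehci_lock, &irq_thread);

	/* ehci_hrtimer_func() running in hard-IRQ context */
	ok = toy_acquire(&ehci_lock, &hrtimer) && ok;

	return ok;
}
```

In this model toy_replay(false) returns false (the inconsistent state
from the log), while toy_replay(true) returns true, because the lock is
then never taken with hard IRQs enabled.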

> > > > >> (Note: As far as I can tell, the commit itself is okay, but it exposes 
> > > > >> a bug somewhere else in the kernel.)
> > > > >>
> > > > >> If you revert that commit from 3.13, does it fix the problem?
> > > > >>
> > > > > Reverting commit 88ed9fd50e57 solves the problem. Thank you so much.
> > > > Oops, I'll try to reproduce and investigate. Thanks for the
> > > > investigations!!!
> > > > 
> > > 
> > > I think the problem may have to do with the threadirqs kernel
> > > parameter.
> > 
> > I don't think threaded irqs work very well with USB, can you try turning
> > that off and seeing if the issue goes away?

There's no reason in principle why the USB stack shouldn't work with 
threaded IRQs, as far as I know.

> I have been using threaded irqs for more than two years without any
> problem.  They work with OHCI, UHCI, EHCI and XHCI.
> 
> This was the first time that a problem occurred.

I have no idea what might have changed between 3.12 and 3.13 to cause 
this problem.  Maybe Thomas can figure it out.

> And yes, the issue goes away when threaded irqs are not used (with and
> without the patch).

Thomas, there must be some reason why the patch below is wrong, but I
don't know enough about the IRQ subsystem to tell what's really going
on.  Can you explain it?

Alan Stern



Index: usb-3.14/kernel/irq/manage.c
===================================================================
--- usb-3.14.orig/kernel/irq/manage.c
+++ usb-3.14/kernel/irq/manage.c
@@ -777,9 +777,12 @@ static irqreturn_t
 irq_forced_thread_fn(struct irq_desc *desc, struct irqaction *action)
 {
 	irqreturn_t ret;
+	unsigned long flags;
 
 	local_bh_disable();
+	local_irq_save(flags);
 	ret = action->thread_fn(action->irq, action->dev_id);
+	local_irq_restore(flags);
 	irq_finalize_oneshot(desc, action);
 	local_bh_enable();
 	return ret;
