Re: ohci: sporadic crash/lockup in ohci-hcd io_watchdog_func()

On Mon, 19 Jan 2015, Heiko Przybyl wrote:

> Hi all,
> 
> been redirected here from bug #91511 [1].
> 
> I'm getting sporadic crashes in io_watchdog_func() in
> drivers/usb/host/ohci-hcd.c:761:
> "
> list_for_each_entry(ed, &ohci->eds_in_use, in_use_list) {
>                 if (ed->pending_td) {
> "
> with the in_use list getting corrupted:
> 
> from ohci_urb_enqueue():
> [43656.918055] list_add double add: new=ffff8800cbaa8040, prev=ffff8800cb8aa5b8, 
> next=ffff8800cbaa8040.
> from ohci_work.part():
> [43656.920980] list_del corruption. next->prev should be ffff8800cbaa8040, but 
> was ffff8800cb8aa5b8
> 
> One or both set the pointer to 0xdead000000100100 and 0xdead0000001000c0, 
> where io_watchdog_func() chokes on [2].

That is indeed a bad bug.
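
For illustration, here is a small user-space sketch (not kernel code) of
why a double add like this is so damaging.  It models list_add()
semantics with a stand-in struct; the names node, add_head, walk_length
and length_after_adds are made up for the example.  In the log above,
new and next are the same address, meaning the entry was inserted
directly in front of itself.  In this model that leaves the entry's next
pointer aimed at itself, so a forward walk from the head never gets back
to the head:

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for a circular doubly linked list entry.
 * Hypothetical type, not the kernel's struct list_head. */
struct node {
    struct node *next, *prev;
};

static void node_init(struct node *head)
{
    head->next = head;
    head->prev = head;
}

/* Link new_node between prev and next, which are assumed adjacent. */
static void insert_between(struct node *new_node,
                           struct node *prev, struct node *next)
{
    next->prev = new_node;
    new_node->next = next;
    new_node->prev = prev;
    prev->next = new_node;
}

/* Insert right after the head, the same position list_add() uses. */
static void add_head(struct node *new_node, struct node *head)
{
    insert_between(new_node, head, head->next);
}

/* Count entries by walking forward from the head.  Returns -1 if the
 * walk has not come back to the head after `limit` entries. */
static int walk_length(const struct node *head, int limit)
{
    int n = 0;
    const struct node *p;

    for (p = head->next; p != head; p = p->next)
        if (++n > limit)
            return -1;
    return n;
}

/* Add the same entry `times` times, then try to walk the list. */
static int length_after_adds(int times)
{
    struct node head, ed;
    int i;

    node_init(&head);
    for (i = 0; i < times; i++)
        add_head(&ed, &head);
    return walk_length(&head, 16);
}
```

With one add the walk sees a single entry.  With two adds of the same
entry the walk gives up at the 16-step limit, because ed.next points
back at ed.  Whether this is exactly the state your list ends up in is
an assumption; the sketch only shows the general failure mode.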

> It seems to be related to keyboard input (at least it happens when using the 
> keyboard), without relation to system load. Can happen within a day after boot 
> or after several days of hibernated uptime. Unfortunately, I haven't found a 
> way to reliably reproduce the issue, yet.
> 
> The box is a "Gigabyte GA-78LMT-USB3" with "AMD FX(tm)-6300 Six-Core 
> Processor" and "[AMD/ATI] SB7x0 USB OHCI1 Controller".
> 
> For more info including crash trace, just have a look at the bug report [1]
> 
> 
> My (pretty wild) guess is, that the corruption happens through a race in the 
> interrupt handler ohci_irq(), which calls ohci_work(), which calls 
> finish_urb(), which states:
> " * PRECONDITION:  ohci lock held, irqs blocked"
> 
> But ohci_irq() seems to only spin_[un]lock(), not
> spin_[un]lock_irq[save|restore](). All other functions that call ohci_work()
> do at least spin_[un]lock_irq. So irqs could still be enabled and possibly
> the event triggered twice, thus the double list add?

That's easy enough to test.  All you have to do is change the 
spin_lock()/spin_unlock() calls in ohci_irq() to their 
spin_lock_irqsave()/spin_unlock_irqrestore() variants.
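
To make the difference concrete, here is a rough user-space model (again
not kernel code) of the two locking styles.  Everything in it is a
stand-in invented for the example: a single flag plays the role of
"local interrupts enabled", and model_lock, handler_plain and
handler_irqsave are hypothetical names.  It only illustrates that the
save/restore style meets the "lock held, irqs blocked" precondition no
matter how the handler was entered, and puts the previous state back
afterwards:

```c
#include <assert.h>
#include <stdbool.h>

/* Models "local interrupts enabled" on the current CPU. */
static bool irqs_enabled = true;

/* Hypothetical stand-in for a spinlock; tracks only whether it is held. */
struct model_lock {
    bool held;
};

/* Plain lock/unlock: the interrupt state is left untouched. */
static void model_lock_plain(struct model_lock *l)
{
    assert(!l->held);
    l->held = true;
}

static void model_unlock_plain(struct model_lock *l)
{
    l->held = false;
}

/* Save the current interrupt state, disable interrupts, take the lock. */
static unsigned long model_lock_irqsave(struct model_lock *l)
{
    unsigned long flags = irqs_enabled;

    irqs_enabled = false;
    assert(!l->held);
    l->held = true;
    return flags;
}

/* Drop the lock and restore whatever interrupt state was saved. */
static void model_unlock_irqrestore(struct model_lock *l, unsigned long flags)
{
    l->held = false;
    irqs_enabled = (flags != 0);
}

/* The precondition quoted above: lock held, irqs blocked. */
static bool precondition_met(const struct model_lock *l)
{
    return l->held && !irqs_enabled;
}

/* Handler using plain lock/unlock; returns whether the precondition held. */
static bool handler_plain(bool entry_irqs_enabled)
{
    struct model_lock lock = { false };
    bool ok;

    irqs_enabled = entry_irqs_enabled;
    model_lock_plain(&lock);
    ok = precondition_met(&lock);
    model_unlock_plain(&lock);
    return ok;
}

/* Handler using the save/restore variants. */
static bool handler_irqsave(bool entry_irqs_enabled)
{
    struct model_lock lock = { false };
    unsigned long flags;
    bool ok;

    irqs_enabled = entry_irqs_enabled;
    flags = model_lock_irqsave(&lock);
    ok = precondition_met(&lock);
    model_unlock_irqrestore(&lock, flags);
    assert(irqs_enabled == entry_irqs_enabled); /* state was restored */
    return ok;
}
```

In this model the plain variant only satisfies the precondition when the
handler is entered with interrupts already off, which is the case that
matters for the question below.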

ohci_irq() is an interrupt handler.  In the absence of threaded IRQs,
the kernel should always call interrupt handlers with interrupts 
disabled.  Do you specify "threadirqs" on your boot command line?

If that's not the explanation then we'll have to dig deeper.

Alan Stern
