Re: Hitting "unused qh not empty" BUG in qh_destroy

On Tue, 16 Sep 2014, Joe Lawrence wrote:

> Quick turn around on this one :)
> 
> crash> log | grep 0000:2c:00.0
> 
> pci 0000:2c:00.0: [8086:1d26] type 00 class 0x0c0320
> pci 0000:2c:00.0: reg 0x10: [mem 0x90000000-0x900003ff]
> pci 0000:2c:00.0: PME# supported from D0 D3hot D3cold
> ehci-pci 0000:2c:00.0: EHCI Host Controller
> ehci-pci 0000:2c:00.0: new USB bus registered, assigned bus number 1
> ehci-pci 0000:2c:00.0: debug port 2
> ehci-pci 0000:2c:00.0: cache line size of 64 is not supported
> ehci-pci 0000:2c:00.0: irq 10, io mem 0x90000000
> ehci-pci 0000:2c:00.0: USB 2.0 started, EHCI 1.00
> usb usb1: SerialNumber: 0000:2c:00.0
> ehci-pci 0000:2c:00.0: qh_link_async:1003 ehci = ffff88084ff1b088, head->qh_next.qh =           (null), qh = ffff88083eacae50
> ehci-pci 0000:2c:00.0: HC died; cleaning up
> ehci-pci 0000:2c:00.0: ehci_endpoint_disable:944 ep->hcpriv = ffff88083eacae50, ehci(ffff88084ff1b088)->num_async = 1, ehci->async->qh_next.qh = ffff88083eacae50
> 
> ------------[ cut here ]------------
> kernel BUG at drivers/usb/host/ehci-hcd.c:1002!

Aha!  So we triggered the BUG_ON you added to ehci_endpoint_disable.

> invalid opcode: 0000 [#1] SMP 
> ...
> CPU: 0 PID: 207 Comm: khubd Tainted: PF          O--------------   3.10.0-123.6.3.el7.bigphysarea_expedient1_usb4.x86_64 #1
> Hardware name: Stratus ftServer 6400/G7LAZ, BIOS BIOS Version 6.3:57 12/25/2013
> task: ffff880853ff8cb0 ti: ffff880853434000 task.ti: ffff880853434000
> RIP: 0010:[<ffffffff814157c6>]  [<ffffffff814157c6>] ehci_endpoint_disable+0x1f6/0x200

This means we have to check what's going on inside
ehci_endpoint_disable.  We know qh wasn't NULL and we know that qh->hw
wasn't NULL (otherwise we would have jumped over the BUG_ON).

In particular, which case of the "switch" statement did we hit?

... And now I see the problem.  It's these two lines just before the 
"switch":

	if (ehci->rh_state < EHCI_RH_RUNNING)
		qh->qh_state = QH_STATE_IDLE;

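With the state forced to QH_STATE_IDLE, the "switch" takes the idle 
case, so that undoubtedly caused us to destroy the QH directly without 
unlinking it first, even though your log shows it was still on the 
async list (num_async = 1).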

I'm pretty sure those two lines aren't needed any more.  Try removing 
them and see if the problem persists.
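
To illustrate the failure mode, here is a rough userspace sketch.  It is 
not the ehci-hcd code: every type and function name below is a 
hypothetical stand-in, and the real driver's asynchronous unlink and 
rescan are collapsed into one synchronous call.  It only shows how 
overriding the state on a dead controller can skip the unlink:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the kernel's definitions. */
enum rh_state { RH_HALTED, RH_SUSPENDED, RH_RUNNING };
enum qh_state { QH_IDLE, QH_LINKED };

struct qh {
	enum qh_state state;
	struct qh *next;
};

struct hc {
	enum rh_state rh_state;
	struct qh head;		/* dummy head of the async list */
	int num_async;
};

static void link_async(struct hc *hc, struct qh *qh)
{
	qh->next = hc->head.next;
	hc->head.next = qh;
	qh->state = QH_LINKED;
	hc->num_async++;
}

/* Synchronous unlink (assumption: the real driver does this in stages). */
static void unlink_async(struct hc *hc, struct qh *qh)
{
	struct qh *prev = &hc->head;

	assert(qh->state == QH_LINKED);
	while (prev->next && prev->next != qh)
		prev = prev->next;
	if (prev->next == qh) {
		prev->next = qh->next;
		hc->num_async--;
	}
	qh->next = NULL;
	qh->state = QH_IDLE;
}

static bool on_async_list(const struct hc *hc, const struct qh *qh)
{
	for (const struct qh *p = hc->head.next; p; p = p->next)
		if (p == qh)
			return true;
	return false;
}

/* Returns true if the QH would be destroyed while still on the list. */
static bool endpoint_disable(struct hc *hc, struct qh *qh, bool force_idle)
{
	/* The two suspect lines, made optional for comparison. */
	if (force_idle && hc->rh_state < RH_RUNNING)
		qh->state = QH_IDLE;

	switch (qh->state) {
	case QH_LINKED:
		unlink_async(hc, qh);
		/* fall through */
	case QH_IDLE:
		return on_async_list(hc, qh);
	}
	return false;
}

/* Link one QH, let the controller die, then disable the endpoint. */
bool dead_hc_leaves_dangling_qh(bool force_idle)
{
	struct hc hc = { .rh_state = RH_RUNNING };
	struct qh qh = { .state = QH_IDLE };

	link_async(&hc, &qh);
	hc.rh_state = RH_HALTED;	/* "HC died; cleaning up" */
	return endpoint_disable(&hc, &qh, force_idle);
}
```

In this sketch, dead_hc_leaves_dangling_qh(true) reports the QH as still 
linked at destroy time, while dead_hc_leaves_dangling_qh(false) unlinks it 
first.  Whether the real driver behaves the same way once those two lines 
are gone is what the test run should tell us.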

> Just to review, I pasted the USB related patches to this kernel below.  The
> drivers/usb/core/hub.c changes include:
> 
>   d8521af "usb: assign default peer ports for root hubs" (maxchild parts)
>   c605f3c "usb: hub: take hub->hdev reference when processing from eventlist"
> 
> the rest are debugging for the qh_destroy BUG.

Yes, the patches look fine.

Alan Stern
