Re: Hitting "unused qh not empty" BUG in qh_destroy

On Thu, 11 Sep 2014, Joe Lawrence wrote:

> Hi Alan,
> 
> I've got another USB bug to report that manifests during automated
> device removal testing on RHEL7.  This one hits the BUG() inside
> qh_destroy:

How reliably can you trigger this bug?

> static void qh_destroy(struct ehci_hcd *ehci, struct ehci_qh *qh)
> {
>         /* clean qtds first, and know this is not linked */
>         if (!list_empty (&qh->qtd_list) || qh->qh_next.ptr) {
>                 ehci_dbg (ehci, "unused qh not empty!\n");
>                 BUG ();
>         }
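
For anyone following along, the check quoted above has two independent ways to fail: a non-empty qtd_list, or a non-NULL qh_next.ptr. Here is a rough user-space model of it, only as a sketch. The names model_qh, model_list_empty and model_would_bug are made up for illustration and are not driver code; the struct keeps just the two fields the check looks at.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for the kernel's circular doubly linked list head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical cut-down QH: only the two fields the check examines. */
struct model_qh {
    void *qh_next_ptr;          /* stands in for qh->qh_next.ptr */
    struct list_head qtd_list;  /* stands in for qh->qtd_list */
};

/* Same idea as list_empty(): an empty list head points back to itself. */
static bool model_list_empty(const struct list_head *head)
{
    return head->next == head;
}

/* True when the precondition is violated, i.e. where BUG() would fire. */
static bool model_would_bug(const struct model_qh *qh)
{
    return !model_list_empty(&qh->qtd_list) || qh->qh_next_ptr != NULL;
}
```

With qtd_list pointing at itself and qh_next_ptr set to NULL, model_would_bug() returns false. Setting qh_next_ptr to any non-NULL address makes it return true even though the list is still empty.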

> and finally a dump of the ehci_qh in question:
> 
> crash> struct ehci_qh ffff88084b84dc80
> struct ehci_qh {
>   hw = 0xffff880078d1a000, 

It would be good to see the contents of the ehci_qh_hw structure.  That 
would tell us what device and endpoint this QH was for.
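
In case it helps with reading that structure: the device address and endpoint number live in the first "endpoint characteristics" dword of the hardware QH (hw_info1 in the driver). Below is a small decoding sketch based on the EHCI queue head layout. The struct qh_info1 and decode_info1() names are hypothetical helpers, not anything in the driver, and the sketch assumes the value has already been converted to CPU byte order.

```c
#include <stdint.h>

/* Selected fields of the first endpoint-characteristics dword of an
 * EHCI queue head.  Helper type for illustration only. */
struct qh_info1 {
    unsigned dev_addr;    /* bits 6:0   device address */
    unsigned endpoint;    /* bits 11:8  endpoint number */
    unsigned speed;       /* bits 13:12 0 = full, 1 = low, 2 = high */
    unsigned max_packet;  /* bits 26:16 maximum packet length */
};

/* Split a CPU-order info1 value into the fields above. */
static struct qh_info1 decode_info1(uint32_t info1)
{
    struct qh_info1 d;

    d.dev_addr   = info1 & 0x7f;
    d.endpoint   = (info1 >> 8) & 0xf;
    d.speed      = (info1 >> 12) & 0x3;
    d.max_packet = (info1 >> 16) & 0x7ff;
    return d;
}
```

Feeding it the first word of the ehci_qh_hw dump should be enough to tell which device and endpoint the QH belonged to.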

>   qh_dma = 0x78d1a000, 
>   qh_next = {
>     qh = 0xffff88084efe6730, 
>     itd = 0xffff88084efe6730, 
>     sitd = 0xffff88084efe6730, 
>     fstn = 0xffff88084efe6730, 
>     hw_next = 0xffff88084efe6730, 
>     ptr = 0xffff88084efe6730                     << !NULL
>   }, 
>   qtd_list = {                                   << list_empty
>     next = 0xffff88084b84dc98, 
>     prev = 0xffff88084b84dc98
>   }, 
>   intr_node = {
>     next = 0x0, 
>     prev = 0x0
>   }, 
>   dummy = 0xffff880078d22000, 
>   unlink_node = {
>     next = 0xffff88084b84dcc0, 
>     prev = 0xffff88084b84dcc0
>   }, 
>   unlink_cycle = 0x0, 
>   qh_state = 0x1,                                << QH_STATE_LINKED
...
> }
> 
> The qtd_list is empty: its next and prev pointers both point back to
> the list head itself.
> 
> crash> struct -o ehci_qh | grep td_list
>   [0x18] struct list_head qtd_list;
> crash> p/x 0xffff88084b84dc80 + 0x18
> $1 = 0xffff88084b84dc98
> 
> but qh->qh_next.ptr is !NULL, so we hit the BUG.  However, it seems that
> the memory at qh->qh_next.ptr has been freed:

> I'm not too familiar with the USB code stack, so any suggestions on
> instrumentation that I can add to aid in debugging would be helpful.
> Maybe some tracing in qh_link_async / single_unlink_async /
> end_unlink_async / qh_link_periodic can reveal the sequence that is
> leaving this dangling qh_next.ptr?

The place to look is ehci_endpoint_disable.  Did that routine get 
called for this QH?  Did it hit the default case of the big switch 
statement (with its ehci_err statement)?

> Note: This does bear some resemblance to a bug that Stratus hit a few
> years ago [1] [2], however enough of the code has changed that I'm not
> sure the fix for that one would apply to a modern kernel.

What version of the driver are you currently running?

Alan Stern
