Re: Hitting "unused qh not empty" BUG in qh_destroy

On Fri, 12 Sep 2014 11:31:46 -0400
Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:

> On Thu, 11 Sep 2014, Joe Lawrence wrote:
> 
> > Hi Alan,
> > 
> > I've got another USB bug to report that manifests during automated
> > device removal testing on RHEL7.  This one hits the BUG() inside
> > qh_destroy:
> 
> How reliably can you trigger this bug?

I have collected a few crashes over a few days, so it reproduces
fairly often.

> >  67 static void qh_destroy(struct ehci_hcd *ehci, struct ehci_qh *qh)
> >  68 {
> >  69         /* clean qtds first, and know this is not linked */
> >  70         if (!list_empty (&qh->qtd_list) || qh->qh_next.ptr) {
> >  71                 ehci_dbg (ehci, "unused qh not empty!\n");
> >  72                 BUG ();
> >  73         }
> 
> > and finally a dump of the ehci_qh in question:
> > 
> > crash> struct ehci_qh ffff88084b84dc80
> > struct ehci_qh {
> >   hw = 0xffff880078d1a000, 
> 
> It would be good to see the contents of the ehci_qh_hw structure.  That 
> would tell us what device and endpoint this QH was for.

crash> struct ehci_qh_hw 0xffff880078d1a000
struct ehci_qh_hw {
  hw_next = 0x78d1a062, 
  hw_info1 = 0x8000, 
  hw_info2 = 0x0, 
  hw_current = 0x0, 
  hw_qtd_next = 0x1, 
  hw_alt_next = 0x78d22000, 
  hw_token = 0x40, 
  hw_buf = {0x0, 0x0, 0x0, 0x0, 0x0}, 
  hw_buf_hi = {0x0, 0x0, 0x0, 0x0, 0x0}
}
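
For reading the hw_info1 value above, here is a minimal userspace sketch
that splits the dword into the endpoint-characteristics fields as laid
out in the EHCI specification (device address in bits 6:0, endpoint
number in bits 11:8, speed in bits 13:12, the H bit at bit 15, max
packet size in bits 26:16).  The struct and function names are made up
for illustration and are not taken from the driver.

```c
#include <stdint.h>

/* Hypothetical helper type: decoded fields of a QH's hw_info1 dword. */
struct qh_info1 {
	unsigned devaddr;     /* bits 6:0   device address */
	unsigned inactivate;  /* bit 7      I bit */
	unsigned epnum;       /* bits 11:8  endpoint number */
	unsigned speed;       /* bits 13:12 0 = full, 1 = low, 2 = high */
	unsigned dtc;         /* bit 14     data toggle control */
	unsigned head;        /* bit 15     H bit (head of reclamation list) */
	unsigned maxpacket;   /* bits 26:16 maximum packet length */
};

/* Split a CPU-endian hw_info1 value into its fields. */
static struct qh_info1 decode_info1(uint32_t v)
{
	struct qh_info1 f;

	f.devaddr    = v & 0x7f;
	f.inactivate = (v >> 7) & 0x1;
	f.epnum      = (v >> 8) & 0xf;
	f.speed      = (v >> 12) & 0x3;
	f.dtc        = (v >> 14) & 0x1;
	f.head       = (v >> 15) & 0x1;
	f.maxpacket  = (v >> 16) & 0x7ff;
	return f;
}
```

Feeding it 0x8000 gives device address 0, endpoint 0, max packet 0, and
only the H bit set.  The crash utility prints these fields in CPU byte
order, so no byte swapping is assumed here.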

> >   qh_dma = 0x78d1a000, 
> >   qh_next = {
> >     qh = 0xffff88084efe6730, 
> >     itd = 0xffff88084efe6730, 
> >     sitd = 0xffff88084efe6730, 
> >     fstn = 0xffff88084efe6730, 
> >     hw_next = 0xffff88084efe6730, 
> >     ptr = 0xffff88084efe6730                     << !NULL
> >   }, 
> >   qtd_list = {                                   << list_empty
> >     next = 0xffff88084b84dc98, 
> >     prev = 0xffff88084b84dc98
> >   }, 
> >   intr_node = {
> >     next = 0x0, 
> >     prev = 0x0
> >   }, 
> >   dummy = 0xffff880078d22000, 
> >   unlink_node = {
> >     next = 0xffff88084b84dcc0, 
> >     prev = 0xffff88084b84dcc0
> >   }, 
> >   unlink_cycle = 0x0, 
> >   qh_state = 0x1,                                << QH_STATE_LINKED
> ...
> > }
> > 
> > The qtd_list is empty, contains only one entry, itself.
> > 
> > crash> struct -o ehci_qh | grep td_list
> >   [0x18] struct list_head qtd_list;
> > crash> p/x 0xffff88084b84dc80 + 0x18
> > $1 = 0xffff88084b84dc98
> > 
> > but qh->qh_next.ptr is !NULL, so we hit the BUG.  However, it seems that
> > the memory at qh->qh_next.ptr has been freed:
> 
> > I'm not too familiar with the USB code stack, so any suggestions on
> > instrumentation that I can add to aid in debugging would be helpful.
> > Maybe some tracing in qh_link_async / single_unlink_async /
> > end_unlink_async /qh_link_periodic can reveal the sequence that is
> > leaving this dangling qh_next.ptr?
> 
> The place to look is ehci_endpoint_disable.  Did that routine get 
> called for this QH?  Did it hit the default case of the big switch 
> statement (with its ehci_err statement)?

I'm not sure there is enough residual data in the crash dump to tell
whether ehci_endpoint_disable ran.  However, the QH that qh_destroy was
handling did *not* have the exception bit set.  (See the first mail for
the structure dump.)

Would it be reasonable to add printk debugging messages to
ehci_endpoint_disable to trace the QH in question and its qh_state?

> > Note: This does bear some resemblance to a bug that Stratus hit a few
> > years ago [1] [2], however enough of the code has changed that I'm not
> > sure the fix for that one would apply to a modern kernel.
> 
> What version of the driver are you currently running?

The driver is built into a slightly modified RHEL7 3.10.0-123.6.3.el7.x86_64 kernel.

Regards,

-- Joe