On Tue, 26 Mar 2013, Noone Nowhere wrote: > Hello again Alan, we read that the patch broke. What patch? The one that works around the Intel/AMD hardware problem? Yes, it had a mistake, which has now been fixed. (Although the fix has not yet been released in a 3.8.stable kernel.) > Damn, our fear was > justified. If we are right, it will break again because it affects > various > devices using different busses, each one with its limitations and > defects. That reasoning doesn't make sense. Practically every line of code in ehci-hcd.c and its associated files affects various devices using different buses. If your fear was justified, the driver would never work at all. > That's why we wrote to Sarah(and got no reply yet) that Your message to Sarah was little more than a rant about issues that are out of her control and took place long before she started working at Intel. Surely you didn't expect her to have a constructive reply? > vendors > should fix their h/w bugs in-house or give open source developers full > documentation (like nda datasheets,bios spec,bios spec up) > to cope with their >buggy< hardware! Well, it's questionable whether this really should be called a bug. Clearly it is undesirable behavior, but strictly speaking it is not in violation of the EHCI spec. > > Probably it is cured. But something is still wrong, even though it may > > be unrelated. > > Since something is broken and the last program tested exits at A50M, > we have some questions for today prior testing A50M, moschip and > ICH4M: > > 1)Do we need to apply a third > patch(http://marc.info/?l=linux-usb&m=136379040918329&w=2) [it's upper > part to be specific]? Are you considering the hardware workaround and the bug-detection patch as the first two? Then yes, a third patch is needed, but not the one you mentioned. The third patch is here: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit?id=d714aaf649460cbfd5e82e75520baa856b4fa0a0 Oddly, I can't find the patch submission anywhere in the email archives. Maybe I forgot to CC: linux-usb when sending it in. > If yes, please make > a cumulative patch for 3.8 series before we mess things up and post it > under this subject. It is below. > At the same post, add the final test program with > program version number printing as well as a "do NOT unplug the usb > and a leave the program to go up to 1000" message. Also point out that > the > device must not be inside the unusual devs file as you told us. This > will make it easier for others to follow and help you. You can easily make these changes and post the result yourself. > 2)If A50M still fails, do you have time to analyze our usbmon output? Yes. > 3)Has any type of transfer corruption been observed from this bug? Not as far as I know. Alan Stern Index: 3.8/drivers/usb/host/ehci-q.c =================================================================== --- 3.8.orig/drivers/usb/host/ehci-q.c +++ 3.8/drivers/usb/host/ehci-q.c @@ -547,7 +547,7 @@ qh_completions (struct ehci_hcd *ehci, s if (stopped != 0 || hw->hw_qtd_next == EHCI_LIST_END(ehci)) { switch (state) { case QH_STATE_IDLE: - qh_refresh(ehci, qh); +// qh_refresh(ehci, qh); break; case QH_STATE_LINKED: /* We won't refresh a QH that's linked (after the HC @@ -1170,7 +1170,7 @@ static void single_unlink_async(struct e struct ehci_qh *prev; /* Add to the end of the list of QHs waiting for the next IAAD */ - qh->qh_state = QH_STATE_UNLINK; + qh->qh_state = QH_STATE_UNLINK_WAIT; if (ehci->async_unlink) ehci->async_unlink_last->unlink_next = qh; else @@ -1213,9 +1213,19 @@ static void start_iaa_cycle(struct ehci_ /* Do only the first waiting QH (nVidia bug?) */ qh = ehci->async_unlink; - ehci->async_iaa = qh; - ehci->async_unlink = qh->unlink_next; - qh->unlink_next = NULL; + + /* + * Intel (?) bug: The HC can write back the overlay region + * even after the IAA interrupt occurs. In self-defense, + * always go through two IAA cycles for each QH. + */ + if (qh->qh_state == QH_STATE_UNLINK_WAIT) { + qh->qh_state = QH_STATE_UNLINK; + } else { + ehci->async_iaa = qh; + ehci->async_unlink = qh->unlink_next; + qh->unlink_next = NULL; + } /* Make sure the unlinks are all visible to the hardware */ wmb(); @@ -1232,6 +1242,7 @@ static void start_iaa_cycle(struct ehci_ static void end_unlink_async(struct ehci_hcd *ehci) { struct ehci_qh *qh; + __hc32 tok1, tok2; if (ehci->has_synopsys_hc_bug) ehci_writel(ehci, (u32) ehci->async->qh_dma, @@ -1242,6 +1253,7 @@ static void end_unlink_async(struct ehci ehci->async_unlinking = true; while (ehci->async_iaa) { qh = ehci->async_iaa; + tok1 = ACCESS_ONCE(qh->hw->hw_token); ehci->async_iaa = qh->unlink_next; qh->unlink_next = NULL; @@ -1250,8 +1262,14 @@ static void end_unlink_async(struct ehci qh_completions(ehci, qh); if (!list_empty(&qh->qtd_list) && - ehci->rh_state == EHCI_RH_RUNNING) + ehci->rh_state == EHCI_RH_RUNNING) { + udelay(10); + tok2 = ACCESS_ONCE(qh->hw->hw_token); + if (tok1 != tok2) + ehci_err(ehci, "EHCI hardware bug detected: %08x %08x\n", + tok1, tok2); qh_link_async(ehci, qh); + } disable_async(ehci); } ehci->async_unlinking = false; Index: 3.8/drivers/usb/host/ehci-hcd.c =================================================================== --- 3.8.orig/drivers/usb/host/ehci-hcd.c +++ 3.8/drivers/usb/host/ehci-hcd.c @@ -748,11 +748,9 @@ static irqreturn_t ehci_irq (struct usb_ /* guard against (alleged) silicon errata */ if (cmd & CMD_IAAD) ehci_dbg(ehci, "IAA with IAAD still set?\n"); - if (ehci->async_iaa) { + if (ehci->async_iaa) COUNT(ehci->stats.iaa); - end_unlink_async(ehci); - } else - ehci_dbg(ehci, "IAA with nothing unlinked?\n"); + end_unlink_async(ehci); } /* remote wakeup [4.3.1] */ Index: 3.8/drivers/usb/host/ehci-timer.c =================================================================== --- 3.8.orig/drivers/usb/host/ehci-timer.c +++ 3.8/drivers/usb/host/ehci-timer.c @@ -305,7 +305,7 @@ static void ehci_iaa_watchdog(struct ehc * (a) SMP races against real IAA firing and retriggering, and * (b) clean HC shutdown, when IAA watchdog was pending. */ - if (ehci->async_iaa) { + if (1) { u32 cmd, status; /* If we get here, IAA is *REALLY* late. It's barely -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html