2013/2/26 Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>:
> Sarah (and anyone else who's interested):
>
> A while ago I wrote about a hardware bug in my Intel ICH5 and ICH8 EHCI
> controllers. You pointed out that these are rather old components, not
> being used in current systems, which is quite true.
>
> Now I have figured out a simple way for anyone to test for this bug in
> any EHCI controller, without the need for a g-zero gadget. It's a
> two-part procedure:
>
>     Apply the patch below (which is written for vanilla 3.8) and
>     load the resulting driver. The patch adds an explicit test
>     to ehci-hcd for detecting the bug.
>
>     Then plug in an ordinary USB flash drive and run the attached
>     program (as root), giving it the device path for the flash
>     drive as the single command-line argument. For example:
>
>         sudo ./ehci-test /dev/bus/usb/002/003
>
> The program won't do anything bad to the flash drive; it just reads the
> first 256 KB of data over and over again, now and then unlinking an URB
> to try and trigger the bug. If the program works right, it will print
> out a loop counter every hundred iterations. If it runs for 1000
> iterations with no error messages in the kernel log, you may consider
> that the controller has passed the test. This should take under a
> minute, depending on the hardware speed.
>
> The program won't stop by itself unless something goes wrong. You can
> kill it with ^C or more simply by unplugging the flash drive. (If you
> want to be safe, make sure there are no mounted filesystems on the
> drive before running the test program.)
>
> If the hardware bug is detected, the kernel patch will print error
> messages to the system log.
> For example, when I run the test on the
> Intel controller in this computer, I get:
>
> [ 150.019441] usb-storage 3-8:1.0: disconnect by usbfs
> [ 150.271190] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 00008d00 80008d00
> [ 150.591089] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 00008d00 80008d00
> [ 151.538560] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 00008d00 80008d00
> [ 151.857569] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 00008d00 80008d00
> [ 152.018886] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 00008d00 80008d00
> [ 152.179810] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 80008d00 00008d00
> [ 153.211804] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 00008d00 80008d00
> [ 153.374497] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 00008d00 80008d00
> [ 153.770443] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 80008d00 00008d00
> [ 154.247861] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 82008d80 00008d00
> [ 154.566912] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 82008d80 00008d00
> [ 155.359101] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 00008d00 80008d00
> [ 155.838132] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 00008d00 80008d00
> [ 156.791107] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 80008d00 00008d00
> [ 157.267620] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 00008d00 80008d00
> [ 159.252057] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 80008d00 00008d00
> [ 159.886048] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 80008d00 00008d00
> [ 160.206625] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 02008d80 80008d00
> ...
>
> You get the idea. The values in the two columns on the right are
> always supposed to be equal; when they aren't it indicates that the
> controller has done a DMA write at a time when ehci-hcd isn't expecting
> one to happen.
>
> I'd be interested to hear the results of testing on a variety of
> controllers.
> (This computer also has an NEC EHCI controller, and that
> one does not have the bug.) Do the EHCI controllers on current Intel
> chipsets pass the test? What about other vendors?
>
> Thanks to all who try it out and report their results.

I tested this on a Sandy Bridge platform with v3.8. The first run gave
the following output, but after that it was hard to get any output at
all.

sudo ./ehci-test /dev/bus/usb/001/003
[ 140.855342] usb-storage 1-1.2:1.0: disconnect by usbfs
Invalid URB stat[ 140.863000] ehci-pci 0000:00:1a.0: shutdown urb ffff88014545f300 ep1in-bulk
[ 140.871303] ehci-pci 0000:00:1a.0: shutdown urb ffff88014545f0c0 ep1in-bulk
[ 140.878231] ehci-pci 0000:00:1a.0: shutdown urb ffff88014545fcc0 ep1in-bulk
[ 140.885158] ehci-pci 0000:00:1a.0: shutdown urb ffff88014545fb40 ep1in-bulk
[ 140.892088] ehci-pci 0000:00:1a.0: shutdown urb ffff88014545f9c0 ep1in-bulk
[ 140.899015] ehci-pci 0000:00:1a.0: shutdown urb ffff88014545f780 ep1in-bulk
[ 140.905941] ehci-pci 0000:00:1a.0: shutdown urb ffff88014545f240 ep1in-bulk
[ 140.912870] ehci-pci 0000:00:1a.0: shutdown urb ffff88014545f900 ep1in-bulk
[ 140.919799] ehci-pci 0000:00:1a.0: shutdown urb ffff88014545fc00 ep1in-bulk
[ 140.926725] ehci-pci 0000:00:1a.0: shutdown urb ffff88014545f540 ep1in-bulk
[ 140.933655] ehci-pci 0000:00:1a.0: shutdown urb ffff88014545f3c0 ep1in-bulk
[ 140.940583] ehci-pci 0000:00:1a.0: shutdown urb ffff88014545fd80 ep1in-bulk
[ 140.947512] ehci-pci 0000:00:1a.0: shutdown urb ffff88014545f600 ep1in-bulk
[ 140.954440] ehci-pci 0000:00:1a.0: shutdown urb ffff88014545f180 ep1in-bulk
[ 140.961368] ehci-pci 0000:00:1a.0: shutdown urb ffff88014545f000 ep1in-bulk
[ 140.968297] ehci-pci 0000:00:1a.0: shutdown urb ffff88014545fa80 ep1in-bulk
[ 140.975223] ehci-pci 0000:00:1a.0: shutdown urb ffff88014545f840 ep1in-bulk
us -32, act len [ 140.982151] ehci-pci 0000:00:1a.0: shutdown urb ffff88014545fe40 ep1in-bulk
[ 140.990459] ehci-pci 0000:00:1a.0: shutdown urb ffff88014545ff00 ep1in-bulk
[ 140.997388] ehci-pci 0000:00:1a.0: shutdown urb ffff880145f08000 ep1in-bulk
[ 141.004316] ehci-pci 0000:00:1a.0: shutdown urb ffff880145f080c0 ep1in-bulk
[ 141.011245] ehci-pci 0000:00:1a.0: shutdown urb ffff880145f08180 ep1in-bulk

>
> Alan Stern
>
>
>
>
> Index: usb-3.8/drivers/usb/host/ehci-q.c
> ===================================================================
> --- usb-3.8.orig/drivers/usb/host/ehci-q.c
> +++ usb-3.8/drivers/usb/host/ehci-q.c
> @@ -547,7 +547,7 @@ qh_completions (struct ehci_hcd *ehci, s
>  	if (stopped != 0 || hw->hw_qtd_next == EHCI_LIST_END(ehci)) {
>  		switch (state) {
>  		case QH_STATE_IDLE:
> -			qh_refresh(ehci, qh);
> +//			qh_refresh(ehci, qh);
>  			break;
>  		case QH_STATE_LINKED:
>  			/* We won't refresh a QH that's linked (after the HC
> @@ -1232,6 +1232,7 @@ static void start_iaa_cycle(struct ehci_
>  static void end_unlink_async(struct ehci_hcd *ehci)
>  {
>  	struct ehci_qh		*qh;
> +	__hc32			tok1, tok2;
>
>  	if (ehci->has_synopsys_hc_bug)
>  		ehci_writel(ehci, (u32) ehci->async->qh_dma,
> @@ -1242,6 +1243,7 @@ static void end_unlink_async(struct ehci
>  	ehci->async_unlinking = true;
>  	while (ehci->async_iaa) {
>  		qh = ehci->async_iaa;
> +		tok1 = ACCESS_ONCE(qh->hw->hw_token);
>  		ehci->async_iaa = qh->unlink_next;
>  		qh->unlink_next = NULL;
>
> @@ -1250,8 +1252,14 @@ static void end_unlink_async(struct ehci
>
>  		qh_completions(ehci, qh);
>  		if (!list_empty(&qh->qtd_list) &&
> -				ehci->rh_state == EHCI_RH_RUNNING)
> +				ehci->rh_state == EHCI_RH_RUNNING) {
> +			udelay(10);
> +			tok2 = ACCESS_ONCE(qh->hw->hw_token);
> +			if (tok1 != tok2)
> +				ehci_err(ehci, "EHCI hardware bug detected: %08x %08x\n",
> +						tok1, tok2);
>  			qh_link_async(ehci, qh);
> +		}
>  		disable_async(ehci);
>  	}
>  	ehci->async_unlinking = false;

-- 
Best regards
Tianyu Lan
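
P.S. For anyone reading the two values printed by the "EHCI hardware bug detected" message (tok1 and tok2 in the patch), a small decoder may help. This is only a sketch and is not part of the patch or of the attached test program. The helper names are made up for illustration, and the bit positions are an assumption taken from the qTD token layout in the EHCI 1.0 specification (section 3.5.3), not from anything stated in this thread.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helpers for picking apart an hw_token value.
 * Assumed layout (EHCI 1.0, qTD token dword):
 *   bits 7:0   status (bit 7 = Active)
 *   bits 9:8   PID code (0 = OUT, 1 = IN, 2 = SETUP)
 *   bits 11:10 error counter (Cerr)
 *   bit  15    interrupt on complete (IOC)
 *   bits 30:16 total bytes to transfer
 *   bit  31    data toggle
 */
static inline unsigned tok_active(uint32_t t)      { return (t >> 7) & 0x1; }
static inline unsigned tok_pid(uint32_t t)         { return (t >> 8) & 0x3; }
static inline unsigned tok_cerr(uint32_t t)        { return (t >> 10) & 0x3; }
static inline unsigned tok_ioc(uint32_t t)         { return (t >> 15) & 0x1; }
static inline unsigned tok_total_bytes(uint32_t t) { return (t >> 16) & 0x7fff; }
static inline unsigned tok_toggle(uint32_t t)      { return (t >> 31) & 0x1; }

/* Print one token value with its decoded fields. */
static inline void print_token(const char *label, uint32_t t)
{
	printf("%s=%08x toggle=%u bytes=%u ioc=%u cerr=%u pid=%u active=%u\n",
	       label, (unsigned)t, tok_toggle(t), tok_total_bytes(t),
	       tok_ioc(t), tok_cerr(t), tok_pid(t), tok_active(t));
}

/* Print both values from one log line plus the bits that differ. */
static inline void print_token_pair(uint32_t tok1, uint32_t tok2)
{
	print_token("tok1", tok1);
	print_token("tok2", tok2);
	printf("changed bits: %08x\n", (unsigned)(tok1 ^ tok2));
}
```

Under that assumed layout, calling print_token_pair(0x00008d00, 0x80008d00) on the most common pair in Alan's log shows that only bit 31, the data toggle, differs between the two reads.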