Re: Testing for hardware bug in EHCI controllers

On 25.02.2013 21:54, Alan Stern wrote:
> Sarah (and anyone else who's interested):
>
> A while ago I wrote about a hardware bug in my Intel ICH5 and ICH8 EHCI
> controllers.  You pointed out that these are rather old components, not 
> being used in current systems, which is quite true.
>
> Now I have figured out a simple way for anyone to test for this bug in
> any EHCI controller, without the need for a g-zero gadget.  It's a
> two-part procedure:
>
> 	Apply the patch below (which is written for vanilla 3.8) and
> 	load the resulting driver.  The patch adds an explicit test
> 	to ehci-hcd for detecting the bug.
>
> 	Then plug in an ordinary USB flash drive and run the attached
> 	program (as root), giving it the device path for the flash
> 	drive as the single command-line argument.  For example:
>
> 		sudo ./ehci-test /dev/bus/usb/002/003
>
> The program won't do anything bad to the flash drive; it just reads the
> first 256 KB of data over and over again, now and then unlinking an URB
> to try and trigger the bug.  If the program works right, it will print
> out a loop counter every hundred iterations.  If it runs for 1000
> iterations with no error messages in the kernel log, you may consider
> that the controller has passed the test.  This should take under a
> minute, depending on the hardware speed.
>
> The program won't stop by itself unless something goes wrong.  You can
> kill it with ^C or more simply by unplugging the flash drive.  (If you
> want to be safe, make sure there are no mounted filesystems on the
> drive before running the test program.)
>
> If the hardware bug is detected, the kernel patch will print error
> messages to the system log.  For example, when I run the test on the
> Intel controller in this computer, I get:
>
> [  150.019441] usb-storage 3-8:1.0: disconnect by usbfs
> [  150.271190] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 00008d00 80008d00
> [  150.591089] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 00008d00 80008d00
> [  151.538560] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 00008d00 80008d00
> [  151.857569] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 00008d00 80008d00
> [  152.018886] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 00008d00 80008d00
> [  152.179810] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 80008d00 00008d00
> [  153.211804] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 00008d00 80008d00
> [  153.374497] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 00008d00 80008d00
> [  153.770443] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 80008d00 00008d00
> [  154.247861] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 82008d80 00008d00
> [  154.566912] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 82008d80 00008d00
> [  155.359101] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 00008d00 80008d00
> [  155.838132] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 00008d00 80008d00
> [  156.791107] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 80008d00 00008d00
> [  157.267620] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 00008d00 80008d00
> [  159.252057] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 80008d00 00008d00
> [  159.886048] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 80008d00 00008d00
> [  160.206625] ehci-pci 0000:00:1d.7: EHCI hardware bug detected: 02008d80 80008d00
> ...
>
> You get the idea.  The values in the two columns on the right are 
> always supposed to be equal; when they aren't it indicates that the 
> controller has done a DMA write at a time when ehci-hcd isn't expecting 
> one to happen.
>
> I'd be interested to hear the results of testing on a variety of 
> controllers.  (This computer also has an NEC EHCI controller, and that 
> one does not have the bug.)  Do the EHCI controllers on current Intel 
> chipsets pass the test?  What about other vendors?
>
> Thanks to all who try it out and report their results.
>
> Alan Stern

Here is the result of your test procedure (your patch applied, running
kernel 3.9-rc1) for the following device:

00:02.1 USB controller [0c03]: NVIDIA Corporation MCP61 USB 2.0 Controller [10de:03f2] (rev a2) (prog-if 20 [EHCI])
        Subsystem: ASUSTeK Computer Inc. Device [1043:8234]
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0 (750ns min, 250ns max)
        Interrupt: pin B routed to IRQ 22
        Region 0: Memory at fe02e000 (32-bit, non-prefetchable) [size=256]
        Capabilities: [44] Debug port: BAR=1 offset=0098
        Capabilities: [80] Power Management version 2
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Kernel driver in use: ehci-pci


=> dmesg output:
[  207.965961] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 80008d00 00008d00
[  208.020904] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 82008d80 00008d00
[  208.198698] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 80008d00 00009d00
[  208.201699] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 82008d80 00009d00
[  208.227968] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 00008d00 80008d00
[  208.230453] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 80008d00 00008d00
[  208.264518] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 00008d00 80008d00
[  208.287447] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 02008d80 80008d00
[  208.398602] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 82008d80 00008d00
[  208.406755] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 80008d00 00008d00
[  208.456527] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 02008d80 80008d00
[  208.460998] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 80008d00 00009d00
[  208.497597] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 80008d00 00008d00
[  208.556599] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 82008d80 00008d00
[  208.560598] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 80008d00 00009d00
[  208.563607] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 82008d80 00008d00
[  208.651304] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 00008d00 80008d00
[  208.692580] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 82008d80 00008d00
[  208.692936] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 82008d80 00008d00
[  208.788225] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 80008d00 82008d80
[  208.831607] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 80008d00 00008d00
[  208.851212] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 80008d00 00008d00
[  208.862448] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 82008d80 00008d00
[  208.919208] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 02008d80 80009d00
[  208.968208] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 82008d80 00009d00
[  208.980844] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 02008d80 80008d00
[  209.138456] ehci-pci 0000:00:02.1: EHCI hardware bug detected: 82008d80 00008d00
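
In case it helps anyone comparing reports from several controllers: since
the two values are supposed to be equal, a log like this can be boiled
down by counting the mismatches per controller and XORing each pair to
see which bits differ. Below is a rough sketch in Python. It assumes only
the message format quoted in this thread; the names summarize() and
format_report() are made up for illustration and are not part of the
patch or of the ehci-test program.

```python
import re
from collections import Counter

# Message format as it appears in this thread. \s+ also spans a line
# break, so log lines that were wrapped by a mail client still match.
BUG_RE = re.compile(
    r"(\S+): EHCI hardware bug detected:\s+([0-9a-fA-F]{8})\s+([0-9a-fA-F]{8})"
)


def summarize(log_text):
    """Count reported mismatches per controller and per differing-bit mask."""
    per_device = Counter()
    per_mask = Counter()
    for match in BUG_RE.finditer(log_text):
        device, first, second = match.groups()
        per_device[device] += 1
        # XOR leaves a 1 in every bit position where the two values differ.
        per_mask[int(first, 16) ^ int(second, 16)] += 1
    return per_device, per_mask


def format_report(per_device, per_mask):
    """Render the two counters as plain text, one entry per line."""
    lines = []
    for device, count in sorted(per_device.items()):
        lines.append(f"{device}: {count} mismatches")
    for mask, count in sorted(per_mask.items()):
        lines.append(f"  differing bits {mask:08x}: {count}")
    return "\n".join(lines)
```

With the kernel log saved to a file (say dmesg.txt, a hypothetical name),
print(format_report(*summarize(open("dmesg.txt").read()))) prints the
summary.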


Regards,
Frank

