Need help for hardware bug in Intel's EHCI implementation

Sarah:

I just tracked down a tricky problem, which appears to be caused by a
genuine hardware bug.  It's hard to believe this has escaped everyone's
notice for so many years -- maybe my results are wrong.  But as far as
I can tell, they aren't.

Anyway, I don't know what to do about it.  Can you forward this message 
to an appropriate person at Intel?

Thanks.



My system has a fairly old motherboard with the Intel 82801EB/ER
(ICH5/ICH5R) chipset, as shown by lspci:

$ lspci -v -s 1d.7
00:1d.7 USB Controller: Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2 
EHCI Controller (rev 02) (prog-if 20 [EHCI])
        Subsystem: GVC/BCM Advanced Research Device 2181
        Flags: bus master, medium devsel, latency 0, IRQ 23
        Memory at fe77bc00 (32-bit, non-prefetchable) [size=1K]
        Capabilities: <access denied>
        Kernel driver in use: ehci_hcd

$ lspci -vn -s 1d.7
00:1d.7 0c03: 8086:24dd (rev 02) (prog-if 20 [EHCI])
        Subsystem: 14a4:2181
        Flags: bus master, medium devsel, latency 0, IRQ 23
        Memory at fe77bc00 (32-bit, non-prefetchable) [size=1K]
        Capabilities: <access denied>
        Kernel driver in use: ehci_hcd

My test results show that the EHCI hardware generates Interrupt on 
Async Advance (IAA) interrupts too early.

The test is fairly simple.  The async schedule contains nothing but an
empty QH (with the Head bit set) and a QH for a bulk-OUT endpoint.  A
bunch of 512-byte qTD's are added to the bulk QH, and at various times
some of them are cancelled.

This cancellation involves removing the QH from the async schedule.  
The driver uses the scheme outlined in section 4.8.2 of the EHCI spec. 
The Horizontal Pointer in the empty QH is set to point back to itself,
then the Interrupt on Async Advance Doorbell (IAAD) bit is set in the
USBCMD register, and then the driver waits for an interrupt with the
IAA bit set in the USBSTS register.
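The unlink-and-doorbell sequence described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the actual ehci_hcd code: the structures and function names (sketch_qh, sketch_regs, sketch_unlink_and_ring, sketch_iaa_seen) are hypothetical, the registers are modelled as plain memory, and the Horizontal Pointer is modelled as an ordinary C pointer. The bit positions for IAAD (USBCMD bit 6) and IAA (USBSTS bit 5) are the ones given in the EHCI spec.

```c
#include <stdint.h>
#include <stdbool.h>

/* Bit positions from the EHCI specification. */
#define USBCMD_IAAD (1u << 6)   /* Interrupt on Async Advance Doorbell */
#define USBSTS_IAA  (1u << 5)   /* Interrupt on Async Advance */

/* Hypothetical, simplified stand-in for a queue head. */
struct sketch_qh {
    struct sketch_qh *next;     /* models the Horizontal Pointer */
};

/* Hypothetical model of the two operational registers involved. */
struct sketch_regs {
    volatile uint32_t usbcmd;
    volatile uint32_t usbsts;
};

/*
 * Remove the only other QH from the async schedule by pointing the
 * empty head QH back at itself, then ring the doorbell.
 */
static void sketch_unlink_and_ring(struct sketch_regs *regs,
                                   struct sketch_qh *head)
{
    head->next = head;
    /* A real driver needs a write barrier here, before the doorbell. */
    regs->usbcmd |= USBCMD_IAAD;
}

/*
 * Interrupt-time check: returns true if IAA was signalled.
 * Real hardware clears USBSTS bits by writing 1 to them; this model
 * just clears the bit in memory.
 */
static bool sketch_iaa_seen(struct sketch_regs *regs)
{
    if (!(regs->usbsts & USBSTS_IAA))
        return false;
    regs->usbsts &= ~USBSTS_IAA;
    return true;
}
```

Only after sketch_iaa_seen() returns true is the driver supposed to treat the unlinked QH as no longer visible to the controller.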

When the interrupt arrives, the driver stores the values of the qTD
Token words from both the bulk QH's overlay region and the first qTD
whose Active bit is still set.  After the cancelled qTD's have been
removed from the list, the driver inserts the bulk QH back into the
async schedule.

Just before doing this, the driver compares the Token words currently
in the overlay region and the first qTD with the stored values from
earlier, and it logs a message if they are different.  This should
never happen, because the controller should not be able to access the
bulk QH after the IAA interrupt.
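The snapshot-and-compare check amounts to something like the sketch below. The names (token_snapshot, take_snapshot, snapshot_changed) are made up for illustration and are not taken from the driver, and the Token words are assumed to have been converted to host byte order already.

```c
#include <stdint.h>
#include <stdbool.h>

/* Token words saved at the time of the IAA interrupt (hypothetical type). */
struct token_snapshot {
    uint32_t overlay_token;     /* Token word in the QH overlay region */
    uint32_t qtd_token;         /* Token word in the first active qTD */
};

/* Record both Token words when the IAA interrupt arrives. */
static struct token_snapshot take_snapshot(const volatile uint32_t *overlay,
                                           const volatile uint32_t *qtd)
{
    struct token_snapshot s;

    s.overlay_token = *overlay;
    s.qtd_token = *qtd;
    return s;
}

/*
 * Just before relinking the QH: report whether either word has changed
 * since the snapshot.  If the controller really stopped touching the QH
 * at IAA time, this should always return false.
 */
static bool snapshot_changed(const struct token_snapshot *s,
                             const volatile uint32_t *overlay,
                             const volatile uint32_t *qtd)
{
    return s->overlay_token != *overlay || s->qtd_token != *qtd;
}
```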

Nevertheless, I do see occasional changes in the Token words.  It is
obvious that the controller sometimes writes back values for a
completed transaction to the overlay or to the qTD _after_ the IAA
interrupt has been received.  In the tests, the earlier Token values
are 0x02008c80 and the later values are 0x80008c00.  (Oddly enough,
this seems to happen only when the Data Toggle bit changes from 0 to
1.)
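To make the two Token values easier to read, here is a small decoder that follows the qTD Token layout in the EHCI spec. The struct and function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Fields of a qTD Token word, per the EHCI specification layout. */
struct qtd_token_fields {
    unsigned data_toggle;       /* bit 31 */
    unsigned total_bytes;       /* bits 30:16, bytes still to transfer */
    unsigned ioc;               /* bit 15, Interrupt On Complete */
    unsigned c_page;            /* bits 14:12, current page */
    unsigned cerr;              /* bits 11:10, error counter */
    unsigned pid;               /* bits 9:8, 0 = OUT, 1 = IN, 2 = SETUP */
    unsigned status;            /* bits 7:0, bit 7 is Active */
};

/* Split a host-endian Token word into its fields. */
static struct qtd_token_fields decode_qtd_token(uint32_t token)
{
    struct qtd_token_fields f;

    f.data_toggle = (token >> 31) & 0x1;
    f.total_bytes = (token >> 16) & 0x7fff;
    f.ioc         = (token >> 15) & 0x1;
    f.c_page      = (token >> 12) & 0x7;
    f.cerr        = (token >> 10) & 0x3;
    f.pid         = (token >> 8)  & 0x3;
    f.status      = token & 0xff;
    return f;
}
```

Decoded this way, 0x02008c80 is an Active OUT qTD with 512 bytes still to transfer and Data Toggle 0, while 0x80008c00 has Active clear, 0 bytes remaining, and Data Toggle 1. That is what the write-back of a completed 512-byte OUT transaction would look like.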

Assuming the test results are correct, this is clearly a hardware bug,
and a nasty one.  Is it a known problem?  Is there a recommended
workaround?

Thank you,

Alan Stern
