Hi
On 04/07/2014 09:12 PM, Eric Gross wrote:
Hi all,
I am implementing a driver (currently libusb-based, but may change to
kernel-based eventually) for a USB standard class type that makes use of
endpoint stalling as a synchronization mechanism to recover after error
conditions between device and host (the reasons for needing it are a bit
complex). The driver code I have been using works beautifully on Windows,
some embedded OSes with proprietary USB stacks, and Linux via the EHCI
driver. However, I ran into problems as soon as we started using this
driver on XHCI systems (based off the 3.10 kernel).
The sequence the driver typically does when encountering an error (or
thinking it needs to resync) is:
- Abort any pending URBs (may be several queued to the EP)
- Set Feature(HALT)
- Clear EP Stall
- Continue
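
For reference, the two control requests in the sequence above differ only in bRequest. A minimal sketch of the 8-byte SETUP packets, using the standard USB request codes; the helper name halt_setup_packet is made up for illustration and is not part of libusb or the kernel:

```python
import struct

# Standard USB request codes and feature selector (USB 2.0 spec, chapter 9).
USB_REQ_CLEAR_FEATURE = 0x01
USB_REQ_SET_FEATURE = 0x03
USB_RECIP_ENDPOINT = 0x02   # bmRequestType: host-to-device, standard, endpoint
USB_ENDPOINT_HALT = 0x00    # feature selector ENDPOINT_HALT


def halt_setup_packet(ep_address, set_halt):
    """Build the SETUP packet for Set/ClearFeature(ENDPOINT_HALT).

    Hypothetical helper for illustration only.
    """
    b_request = USB_REQ_SET_FEATURE if set_halt else USB_REQ_CLEAR_FEATURE
    # Fields: bmRequestType, bRequest, wValue, wIndex, wLength (little-endian)
    return struct.pack("<BBHHH", USB_RECIP_ENDPOINT, b_request,
                       USB_ENDPOINT_HALT, ep_address, 0)


set_halt = halt_setup_packet(0x02, set_halt=True)
clear_halt = halt_setup_packet(0x02, set_halt=False)
```

Neither request has a data stage (wLength is 0), so nothing on the halted endpoint itself is transferred, and the host controller never sees a STALL handshake from it.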
What we saw with a bus analyzer was that, independent of host controller
used (tested Intel and Renesas), the sequence number of the next outgoing
packet (or toggle bit when in High Speed mode) was incorrect after
clearing the stall. The device resets its expected sequence/toggle after
un-stalling the EP and hence it ignores the next packet with the incorrect
one. Interestingly, some devices are actually tolerant of this behavior
and accept the incorrect sequence id, but any devices based on the Cypress
FX3 (a large number of devices implementing this class type) fail.
When researching this issue I saw a number of previous posts hinting at
known issues like this, but I have not seen a firm conclusion. It seems
that some of the early responses by Sarah Sharp indicate that it is
working this way by design (I admit I am not an expert in the XHCI spec).
I see some newer posts referencing a "clear halt bug", but I have been
unable to find what this definitively is referencing. Based on my
experience with how every other stack appears to work (including the Linux
EHCI driver) and how the device is supposed to behave when it gets the
clear stall request, I can't help but think that the behavior as it
currently is is wrong.
The issue we currently have is that xHCI (both driver and hw)
refuses to reset an endpoint that is not halted.
SetFeature(ENDPOINT_HALT) puts the endpoint into the halted state on the
device, but xHCI will not see the endpoint as halted until some further
transfer on it returns STALL.
So in this case the situation is:

Abort pending urbs
SetFeature(ENDPOINT_HALT)
  - ep halted on device side, xHCI doesn't consider ep halted.
usb_clear_halt()
  - ClearFeature(ENDPOINT_HALT) -> device resets its ep toggle/sequence
  - call hcd->driver->endpoint_reset(), but the xhci .endpoint_reset()
    callback can't reset an endpoint it doesn't consider halted.
    xhci host side toggle/sequence are not reset -> mismatch.
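
The sequence above can be replayed in a small toy model. This is only a sketch of the toggle bookkeeping under simplifying assumptions (a single 1-bit toggle, a device that silently ignores a mismatched packet); the class and function names are invented for illustration and are not xhci driver code:

```python
class Device:
    """Device-side endpoint state (toy model)."""

    def __init__(self):
        self.toggle = 0
        self.halted = False

    def set_halt(self):
        # SetFeature(ENDPOINT_HALT), delivered over the control endpoint
        self.halted = True

    def clear_halt(self):
        # ClearFeature(ENDPOINT_HALT): device resets its toggle/sequence
        self.halted = False
        self.toggle = 0

    def receive(self, toggle):
        if self.halted:
            return "STALL"
        if toggle != self.toggle:
            return "IGNORED"
        self.toggle ^= 1
        return "ACK"


class Host:
    """Host-side endpoint state as the host controller sees it (toy model)."""

    def __init__(self):
        self.toggle = 0
        self.ep_halted = False

    def send(self, dev):
        result = dev.receive(self.toggle)
        if result == "STALL":
            self.ep_halted = True   # only a STALLed transfer halts the ep here
        elif result == "ACK":
            self.toggle ^= 1
        return result

    def endpoint_reset(self):
        if not self.ep_halted:
            return False            # "not halted, refusing to reset"
        self.ep_halted = False
        self.toggle = 0
        return True


def resync(stall_seen_by_host):
    dev, host = Device(), Host()
    host.send(dev)                  # one good transfer: both toggles are now 1
    dev.set_halt()
    if stall_seen_by_host:
        host.send(dev)              # extra transfer that returns STALL
    dev.clear_halt()
    was_reset = host.endpoint_reset()
    return was_reset, host.send(dev)


print(resync(stall_seen_by_host=False))  # (False, 'IGNORED')
print(resync(stall_seen_by_host=True))   # (True, 'ACK')
```

In the first call the host never observes a STALL, so its toggle stays at 1 while the device is back at 0, and the next packet is dropped. In the second call the extra transfer makes the host consider the endpoint halted, the reset goes through, and both sides agree again.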
I can provide any additional information (bus traces, testing results,
etc) as needed. If this is a known issue that someone can point me to the
bugzilla entry for (I have been unsuccessful finding one) or some previous
discussion threads I may have not found, it would be appreciated as well.
With dynamic debugging enabled for xhci you should probably see:
"Endpoint x not halted, refusing to reset."
Discussion threads touching this topic:
http://marc.info/?l=linux-usb&m=134922286125585&w=2
http://marc.info/?l=linux-usb&m=134852269014614&w=2
http://marc.info/?l=linux-usb&m=139025060301432&w=2
I'm focusing on this issue right now, and I would appreciate it if you
could run some tests with your setup once I get something ready.
The main thing that needs to be done is what the xHCI spec states
in an additional note added to section 4.6.8:
" If software wishes reset the Data Toggle or Sequence Number of an
endpoint that isn't in the Halted state, then software may issue a
Configure Endpoint Command with the Drop and Add bits set for the
target endpoint."
Some other tweaking of how the xhci driver handles STALL and clears
halted endpoints is also needed.
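
Roughly, the idea from that note could look like the following toy model. It is a sketch only, not the planned patch: the names are hypothetical, and it ignores ring state and everything else a real driver has to handle around a Configure Endpoint command.

```python
class HostEndpoint:
    """Host-controller view of one endpoint (toy model)."""

    def __init__(self):
        self.halted = False
        self.toggle = 0

    def reset_endpoint(self):
        # Modeled on the current behavior: refuse unless the ep is halted.
        if not self.halted:
            return False
        self.halted = False
        self.toggle = 0
        return True

    def configure_endpoint(self, drop, add):
        # Dropping and re-adding the endpoint re-initializes its state,
        # including the toggle/sequence, whether or not it was halted.
        if drop and add:
            self.halted = False
            self.toggle = 0
            return True
        return False


def clear_halt_host_side(ep):
    """Hypothetical host-side step after ClearFeature(ENDPOINT_HALT)."""
    if ep.reset_endpoint():
        return "reset"
    ep.configure_endpoint(drop=True, add=True)
    return "drop+add"


ep = HostEndpoint()
ep.toggle = 1                       # host toggle is stale, ep not halted
path = clear_halt_host_side(ep)     # falls back to the drop+add path
```

Either path leaves the host-side toggle at 0, matching what the device does when it handles the clear stall request.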
-Mathias