On Wed, Aug 30, 2023, Alan Stern wrote:
> On Wed, Aug 30, 2023 at 01:32:28AM +0000, Thinh Nguyen wrote:
> > That reminds me of another thing, if the host (xhci in this case) does a
> > hard reset to the endpoint, it also resets the TRB pointer with dequeue
> > ep command. So, the transfer should not resume. It needs to be
> > cancelled. This xHCI behavior is the same for Windows and Linux.
>
> That's on the host side, right?  How does this affect the gadget side?
>
> That is, cancelling a transfer on the host doesn't necessarily mean it
> has to be cancelled on the gadget.  Does it have any implications at all
> for the gadget driver?

There are two things that need to stay in sync between host and device:
1) The data sequence.
2) The transfer.

If the host doesn't send CLEAR_FEATURE(halt_ep), then in the best case
the data sequence doesn't match and the host issues a usb reset after
some timeout because the packet won't go through. In the worst case, the
data sequence happens to match 0 and the wrong data is received, causing
corruption.

If the device doesn't cancel the transfer in response to
CLEAR_FEATURE(halt_ep), it may send/receive data of a different transfer
because the host doesn't resume where it left off, causing corruption.

Based on the class protocol, the class driver and gadget driver know what
makes up a "transfer" and can appropriately cancel a transfer to stay in
sync.

>
> > > I think it should be the opposite; the class protocol should specify
> > > how to recover from errors.  If for no other reason than to avoid the
> > > data duplication problem for USB-2.  However, if it doesn't specify a
> > > recovery procedure then there's not much else you can do.
> >
> > Right, unfortunately it's not always the case that the class protocol
> > spells out how to handle a transaction error.
>
> All too true...
>
> > > But regardless, how can the gadget driver make any use of the
> > > knowledge that the UDC received a Clear-Halt?  What would it do
> > > differently?
> > > If the intent is simply to clear an error condition and
> > > continue with the existing transfer, the gadget driver doesn't need to
> > > do anything.
> >
> > It's not simply to clear an error. It is to notify the gadget driver to
> > cancel the active transfer and resync with the host.
>
> How does the gadget driver sync with the host if the class protocol
> doesn't say what should be done?
>
> Also, what if there is no active transfer?  That is, what if the
> transaction that got an error on the host appeared to be successful on
> the gadget and it was the last transaction in the final transfer queued
> for the endpoint?  How would the UDC driver notify the gadget driver in
> this situation?

That's fine. If there's no active transfer, the gadget doesn't need to
cancel anything. As long as the host knows that the transfer did not
complete, it can retry and be in sync. For UASP, the host will send a
new MSC command to retry the failed transfer, i.e. the host would
overwrite/re-read the data at the same transfer offset.

The problem arises if the gadget attempts to resume the incomplete
transfer.

>
> > This is observed in
> > UASP driver in Windows and how various consumer UASP devices handle it.
>
> I don't understand what you're saying here.  How can you observe whether
> a transfer is cancelled in a consumer UAS device?  And how does the
> consumer device resync with the host?

You can see a hang if the transfers are out of sync. If the transfer
isn't cancelled, the device would only source/sink whatever remains of
the previous transfer, which is not enough to complete the new transfer.
The host sees the new transfer as incomplete, hence the hang and the
usb reset.

>
> > There's no equivalent of the Bulk-Only Mass Storage Reset request in the
> > class protocol. We still have the USB analyzer traces for this.
>
> Can you post an example?  Not necessarily in complete detail, but enough
> so that we can see what's going on.
>
> > Regardless of whether the class protocol spells out how to handle the
> > transaction error, if there's a transaction error, the host may send
> > CLEAR_FEATURE(halt_ep) as observed in Windows. The gadget driver needs
> > to know about it to cancel the active transfer and resync with the host.
>
> I'll be able to understand this better after seeing an example.  Do you
> have any traces that were made for a High-speed connection (say, using
> a USB-2 cable)?  It would probably be easier to follow than a SuperSpeed
> example.
>

Unfortunately I only have LeCroy usb analyzer traces of Gen 2x1, not for
usb2 speed. It's a bit tricky to convert them to text with all the
proper info needed to see the full context. If my explanation isn't
clear, I'll try to figure out how to proceed.

Thanks,
Thinh
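
In the absence of a text trace, the resync argument above can be
illustrated with a toy model. This is only a sketch, not driver code and
not taken from any analyzer capture: the class GadgetEndpoint, its
methods, the function retry_after_error() and the 4096/1024-byte sizes
are all hypothetical. It also assumes the function driver handles one
transfer at a time, so a retried command is not started while a stale
transfer is still active.

```python
from dataclasses import dataclass


@dataclass
class GadgetEndpoint:
    """Toy model of a gadget-side bulk IN endpoint (hypothetical names)."""
    cancel_on_clear_halt: bool
    remaining: int = 0  # bytes still pending for the active transfer
    seq: int = 0        # data sequence number (5-bit, SuperSpeed style)

    def queue_transfer(self, length: int) -> None:
        # Assumption: a new transfer starts only when none is active.
        if self.remaining == 0:
            self.remaining = length

    def send(self, max_bytes: int) -> int:
        # Source one packet from whatever transfer is currently active.
        sent = min(max_bytes, self.remaining)
        self.remaining -= sent
        if sent:
            self.seq = (self.seq + 1) % 32
        return sent

    def clear_halt(self) -> None:
        # CLEAR_FEATURE(halt_ep) always resets the data sequence.
        self.seq = 0
        # Cancelling the active transfer is the behavior under discussion.
        if self.cancel_on_clear_halt:
            self.remaining = 0


def retry_after_error(cancel_on_clear_halt: bool, length: int = 4096,
                      packet: int = 1024, good_packets: int = 3) -> int:
    """Return how many bytes the host receives for the retried transfer."""
    ep = GadgetEndpoint(cancel_on_clear_halt)
    ep.queue_transfer(length)
    for _ in range(good_packets):
        ep.send(packet)

    # Transaction error on the host: it resets the endpoint (and its
    # dequeue pointer), sends CLEAR_FEATURE(halt_ep), then retries the
    # whole transfer from the same offset.
    ep.clear_halt()
    ep.queue_transfer(length)

    received = 0
    while True:
        n = ep.send(packet)
        if n == 0:
            break
        received += n
    return received
```

With cancellation, retry_after_error(True) delivers the full 4096 bytes
of the retried transfer. Without it, retry_after_error(False) delivers
only the 1024 bytes left over from the old transfer, so the host sees
the new transfer as incomplete, which is the hang described above.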