On Fri, Mar 04, 2022 at 01:43:00AM +0000, Thinh Nguyen wrote:
Michael Grzeschik wrote:On Wed, Mar 02, 2022 at 10:22:55PM +0000, Thinh Nguyen wrote:Michael Grzeschik wrote:I am currently debugging a strange error with transferring isoc requests with the current mainline driver. When the pending and started list is empty the complete interrupt handler is sending an End Transfer to the controller. This End Transfer will then get successfully completed. irq/50-dwc3-305 [000] d..1. 166.661845: dwc3_gadget_ep_cmd: ep5in: cmd 'End Transfer' [b0d08] params 00000000 00000000 00000000 --> status: Successful irq/50-dwc3-305 [000] d..1. 166.661877: dwc3_event: event (080b01d6): ep5in: Endpoint Command Complete irq/50-dwc3-305 [000] d..1. 166.661979: dwc3_event: event (482a10d6): ep5in: Transfer Not Ready [0000482a] (Not Active) After this the Transfer needs to start again. But unfortunately this sometimes does not succeed. You see in the following trace that the Start of the just prepared trbs did complete with an -18. kworker/u5:2-347 [001] d..2. 166.691288: dwc3_prepare_trb: ep5in: trb 000000007ff95a12 (E233:D232) buf 0000ffffffb1c2a8 size 1x 12 ctrl 00000465:00000000 (HlCs:Sc:isoc-first) kworker/u5:2-347 [001] d..2. 166.691293: dwc3_prepare_trb: ep5in: trb 00000000f9cd175e (E234:D232) buf 0000ffffffb1d000 size 1x 7156 ctrl 00000471:00000000 (Hlcs:Sc:isoc)[snip]The intermediate Bus Expiry is not always present in that case, but also completely acceptable as we loop over the Start Transfer until this succeeds. Do you have any Idea why this succeeded Start Streaming does come back with an Missed Transfer on the first trb?Hi Michael, When sending the trace, please leave it unfiltered. It's missing some trace events that I want to check.Right, it was a mix of trace_event:dwc3:* and ftrace_filter=dwc*. So I kind of felt filtering out the interesting part would help. Anyway. Here is the same situation in another trace, with all trace_events from dwc3:<snip>Please check if the host is actually polling for data at that uframe. Each TRB is scheduled for a particular interval. If the host doesn't request for data at that particular interval, then it's a missed isoc also (err -18). So, if you have a usb traffic analyzer, please verify this. Also, maybe try with a different host vendor to see if the behavior is the same.The behavior seems to differ if I attach different hosts. I have no traffic analyzer. Would usbmon under linux host work for that? How wouldusbmon doesn't see usb packet level. So no, usbmon doesn't work.I synchronize the gadget request and host urbs to know which frame interval would have been the one that was missed in my logs?It will be difficult. You'd have to check the expected interval the urb was scheduled for from xhci driver, and you may have to check the MFINDEX. You'd also need to determine a point of reference that the data received from device is for a particular interval. Even if you have all this info, you'd have to assume everything the host controller reported is accurate and not missing anything.Is there any way the gadget driver could see if for that interval the host did not poll for data? OR are there bits to get this behaviour predictable and the driver to react on?You can check if there any "missed service error" from host, it may give some clues, but not the whole picture.Otherwise, why should on an ongoing stream the host not poll for data in any interval. What would be the constraints to ignore one interval.The host driver may not intentionally does it. There are many reasons where the host controller not poll for data, and I've seen it often. It can be because of the driver handling interrupt latency/delay that prevents the driver from feeding isoc TD/TRBs to the controller in time, causing underflow and starve the controller. Some other time it's the controller that decided the bandwidth is not sufficient for isoc transfer or phy signal instability and cause a missed service error etc. Anyway, in this particular case, from my experience, I don't think it's the host that's missing a poll. It's probably because the device scheduling it too aggressively early. Can you try to schedule on the 2nd interval instead and see if it helps in your case. diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c index 3e75eaa13abc..b9ec7626b080 100644 --- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -2104,7 +2104,7 @@ static int __dwc3_gadget_start_isoc(struct dwc3_ep *dep) } for (i = 0; i < DWC3_ISOC_MAX_RETRIES; i++) { - dep->frame_number = DWC3_ALIGN_FRAME(dep, i + 1); + dep->frame_number = DWC3_ALIGN_FRAME(dep, i + 2); ret = __dwc3_gadget_kick_transfer(dep); if (ret != -EAGAIN)
I think I found the case: From my datasheet: <quote> After a transfer has been started, the hardware will perform the following functions for IN endpoints: 1. Fetch TX data as early as one interval prior to the beginning of the interval (call it A) if the HWO bit is set to one in the TRB. 2. Decrement the buffer size of each TRB as packets are transmitted. 3. Retire TRBs when their buffer size has reached 0, issuing an XferInProgress event if the IOC bit is set. 4. If the next interval (B) starts before all the packets have been transmitted for interval A: a. Flush the TXFIFO. b. Retire the Buffer Descriptor of interval A with a "Missed Isochronous" status. c. Retire the Buffer Descriptor of interval B with a "Missed Isochronous" status. d. See Checking interval status for a description of how software can determine that an interval ended unexpectedly. e. Go to step 1 to prepare for interval C 5. Otherwise, if all the TRBs of interval A are completed, the hardware will prepare for interval B. </quote> Could it be that in the observed case we just run directly into case 4 after issueing Stream Start. Since the frame interval we programm is so close to the current running frame interval. So actually applying the below patch would totally make sense. We could even increase that to an higher number, just to make sure that we would not run into an partially started frame interval when issueing the Start Transfer. Thanks, Michael -- Pengutronix e.K. | | Steuerwalder Str. 21 | http://www.pengutronix.de/ | 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 | Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
Attachment:
signature.asc
Description: PGP signature