Re: DWC3 Isoc Gadget Missed Interrupts on Start Transfer

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Michael Grzeschik wrote:
> On Wed, Mar 02, 2022 at 10:22:55PM +0000, Thinh Nguyen wrote:
>> Michael Grzeschik wrote:
>>> I am currently debugging a strange error with transferring isoc requests
>>> with
>>> the current mainline driver.
>>>
>>> When the pending and started list is empty the complete interrupt
>>> handler is
>>> sending an End Transfer to the controller. This End Transfer will
>>> then get
>>> successfully completed.
>>>
>>>     irq/50-dwc3-305     [000] d..1.   166.661845: dwc3_gadget_ep_cmd:
>>> ep5in: cmd 'End Transfer' [b0d08] params 00000000 00000000 00000000 -->
>>> status: Successful
>>>     irq/50-dwc3-305     [000] d..1.   166.661877: dwc3_event: event
>>> (080b01d6): ep5in: Endpoint Command Complete
>>>     irq/50-dwc3-305     [000] d..1.   166.661979: dwc3_event: event
>>> (482a10d6): ep5in: Transfer Not Ready [0000482a] (Not Active)
>>>
>>>
>>> After this the Transfer needs to start again. But unfortunately this
>>> sometimes
>>> does not succeed. You see in the following trace that the Start of
>>> the just
>>> prepared trbs did complete with an -18.
>>>
>>>    kworker/u5:2-347     [001] d..2.   166.691288: dwc3_prepare_trb:
>>> ep5in: trb 000000007ff95a12 (E233:D232) buf 0000ffffffb1c2a8 size 1x 12
>>> ctrl 00000465:00000000 (HlCs:Sc:isoc-first)
>>>    kworker/u5:2-347     [001] d..2.   166.691293: dwc3_prepare_trb:
>>> ep5in: trb 00000000f9cd175e (E234:D232) buf 0000ffffffb1d000 size 1x
>>> 7156 ctrl 00000471:00000000 (Hlcs:Sc:isoc)
> 
> [snip]
> 
>>>
>>> The intermediate Bus Expiry is not always present in that case, but also
>>> completely acceptable as we loop
>>> over the Start Transfer until this succeeds.
>>>
>>> Do you have any Idea why this succeeded Start Streaming does come back
>>> with an
>>> Missed Transfer on the first trb?
>>>
>>
>> Hi Michael,
>>
>> When sending the trace, please leave it unfiltered. It's missing some
>> trace events that I want to check.
> 
> Right, it was a mix of trace_event:dwc3:* and ftrace_filter=dwc*. So I
> kind of felt filtering out the interesting part would help.
> 
> Anyway. Here is the same situation in another trace, with all
> trace_events from dwc3:
> 

<snip>

> 
>> Please check if the host is actually polling for data at that uframe.
>> Each TRB is scheduled for a particular interval. If the host doesn't
>> request for data at that particular interval, then it's a missed isoc
>> also (err -18). So, if you have a usb traffic analyzer, please verify
>> this. Also, maybe try with a different host vendor to see if the
>> behavior is the same.
> 
> The behavior seems to differ if I attach different hosts. I have no
> traffic analyzer. Would usbmon under linux host work for that? How would

usbmon doesn't see usb packet level. So no, usbmon doesn't work.

> I synchronize the gadget request and host urbs to know which frame
> interval would have been the one that was missed in my logs?

It will be difficult. You'd have to check the expected interval the urb
was scheduled for from xhci driver, and you may have to check the
MFINDEX. You'd also need to determine a point of reference that the data
received from device is for a particular interval. Even if you have all
this info, you'd have to assume everything the host controller reported
is accurate and not missing anything.

> 
> Is there any way the gadget driver could see if for that interval the
> host did not poll for data? OR are there bits to get this behaviour
> predictable and the driver to react on?
> 

You can check if there any "missed service error" from host, it may give
some clues, but not the whole picture.

> Otherwise, why should on an ongoing stream the host not poll
> for data in any interval. What would be the constraints to
> ignore one interval.

The host driver may not intentionally does it. There are many reasons
where the host controller not poll for data, and I've seen it often. It
can be because of the driver handling interrupt latency/delay that
prevents the driver from feeding isoc TD/TRBs to the controller in time,
causing underflow and starve the controller. Some other time it's the
controller that decided the bandwidth is not sufficient for isoc
transfer or phy signal instability and cause a missed service error etc.

Anyway, in this particular case, from my experience, I don't think it's
the host that's missing a poll. It's probably because the device
scheduling it too aggressively early. Can you try to schedule on the 2nd
interval instead and see if it helps in your case.

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 3e75eaa13abc..b9ec7626b080 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -2104,7 +2104,7 @@ static int __dwc3_gadget_start_isoc(struct dwc3_ep
*dep)
        }

        for (i = 0; i < DWC3_ISOC_MAX_RETRIES; i++) {
-               dep->frame_number = DWC3_ALIGN_FRAME(dep, i + 1);
+               dep->frame_number = DWC3_ALIGN_FRAME(dep, i + 2);

                ret = __dwc3_gadget_kick_transfer(dep);
                if (ret != -EAGAIN)



Also, there's another issue in the dwc3 driver. Currently it checks the
event status and reported it to all the requests. I notice that you (or
the device UVC) don't set interrupt for every request but about every
10th request? Not all of the completed requests failed with missed isoc,
but the interrupt event status incorrectly set -18 for all of them. The
dwc3 driver should check the TRB status for each individual TRB and
return the proper status to the associated request.

Thanks,
Thinh




[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux