On Mon, Apr 17, 2023 at 7:49 PM Thinh Nguyen <Thinh.Nguyen@xxxxxxxxxxxx> wrote: > > On Mon, Apr 17, 2023, Avichal Rakesh wrote: > > > > On 4/17/23 06:49, Greg KH wrote: > > > On Fri, Apr 14, 2023 at 02:03:02PM -0700, Avichal Rakesh wrote: > > > > > >> This problem may be further exaggerated by the DWC3 controller driver > > >> (which is what my device has) not setting the IMI flag when > > >> no_interrupt flag is set > > >> (https://urldefense.com/v3/__https://lore.kernel.org/all/ced336c84434571340c07994e3667a0ee284fefe.1666735451.git.Thinh.Nguyen@xxxxxxxxxxxx/)?__;!!A4F2R9G_pg!fB0krUX6qsOXrBsOHzLstVFWqTLL9ncwGFXxlxK0eedLY_29XdwjXRtbO-EWv1eX5okN1rOeBZspZ21KSb5b$ > > >> UVC Gadget Driver sets the no_interrupt flag for ~3/4 of its queued > > >> usb_request, so an ISOC failure may not immediately interrupt the UVC > > >> gadget driver, leaving more time for the frame to finish encoding. > > >> > > >> I couldn't find any concrete error handling rules in the UVC specs, so > > >> I am not sure what the proper solution here is. To try out, I created > > >> a patch (attached below) that dequeues all queued usb_requests from > > >> the endpoint in case of an ISOC failure and clears the uvc buffer > > >> queue. This eliminated the partial frames with no perceivable frame > > >> drops. > > >> > > >> So my questions here are: > > >> 1. Is this a known issue, and if so are there workarounds for it? > > >> 2. If the answer to above is "No", does the explanation and mitigation > > >> seem reasonable? > > >> > > >> Patch follows (mostly for illustration, I can formalize it if > > >> needed!). It adds a new 'req_inflight' list to track queued > > >> usb_requests that have not been given back to the gadget driver and > > >> drops all the queued requests in case of an ISOC failure. The other > > >> changes are for the extra bookkeeping required to handle dropping all > > >> frames. I haven't been able to confirm it, but as far as I can tell > > >> the issue exists at ToT as well. > > >> > > Perhaps this conversation with Jeff may explain the issues you observed: > https://lore.kernel.org/all/20221021164349.fft4yqnxuztsqdeu@xxxxxxxxxxxx/ > > To summarized the long conversation, the UVC needs to maintain a > constant queue of usb requests. Each request is scheduled for a specific > interval. If the UVC couldn't keep up feeding requests to the > controller, then we will have stale requests and missed isoc. > > From the discussion and review, the UVC couldn't queue requests fast > enough. The problem is exacerbated when you reduce the interrupt > frequency with no_interrupt setting. The dwc3 driver has some basic > mechanism to detect for underrun by checking if the queue is empty, but > that's not enough to workaround it. > > Your suggestion to dequeue the request would mean the controller driver > will reschedule the isoc transfer again, which reschedule the next > request for a new interval (potentially avoid being stale). That's fine, > but that may not be a great solution because we're still playing catch > up and the missed isoc already happened. > > You may try to do the followings to mitigate this issue: > 1) Make sure UVC to only queuing rate is constant. Perhaps queue dummy > requests in between time gaps if need be? > This is extremely helpful, thank you! I have a somewhat basic question: For an isoc request, is its "freshness" determined from the time at which usb_ep_queue is called, or from the time at which the previous request was transferred to the host? If the "freshness" is determined from the time 'usb_ep_request' was called, I wonder if the "underrun" is artificial because UVC Gadget driver pumps all free usb_requests at once, and then waits around doing nothing while the usb_requests get stale in the controller's queue. In this case, just slowing the encode loop to wait a little before queuing more requests should do effectively what you suggest above? > 2) Enhance dwc3 to detect underrun from UVC on request queue. ie. If the > queue is empty and a new request is queue, reschedule the isoc transfer. > This will break some other devices if time guarantee is required. So > perhaps an additional flag is needed for this change of behavior. > I am not as familiar with the DWC3, but just to clarify: Is your suggestion to reschedule the stale requests if the controller's send queue is empty (underrun) and the UVC gadget driver queues a new request? Depending on how many stale requests there are, would that run the risk of timing out the new request as well? > BR, > Thinh Best Avi