On Sun, May 12, 2024, Michael Grzeschik wrote:
> On Sat, May 11, 2024 at 12:51:57AM +0000, Thinh Nguyen wrote:
> > On Thu, May 09, 2024, Michael Grzeschik wrote:
> > > On Wed, May 08, 2024 at 11:03:00PM +0000, Thinh Nguyen wrote:
> > > > On Sun, May 05, 2024, Michael Grzeschik wrote:
> > > > > On Wed, Apr 24, 2024 at 01:51:01AM +0000, Thinh Nguyen wrote:
> > > > > > 
> > > > > > Right. Unfortunately, dwc3 can only "guess" when the UVC function stops
> > > > > > pumping more requests or whether it's due to some large latency. The
> > > > > > logic to work around this underrun issue will not be foolproof. Perhaps
> > > > > > we can improve upon it, but the solution is better implemented in the
> > > > > > UVC function driver.
> > > > > 
> > > > > Yes, the best way to solve this is in the uvc driver.
> > > > > 
> > > > > > I thought we have the mechanism in the UVC function now to ensure
> > > > > > queuing enough zero-length requests to account for the underrun/latency
> > > > > > issue? What's the issue now?
> > > > > 
> > > > > This is actually only partially true. Even with the zero-length packets
> > > > > it is possible that we run into underruns. This is why we implemented
> > > > > this patch. It happened because another interrupt thread with the same
> > > > > prio on the same CPU as this interrupt thread was keeping the CPU
> > > > 
> > > > Do you have the data on the worst latency?
> > > 
> > > It was something a bit more than 2ms AFAIR, since with one frame enqueued
> > > we only trigger the interrupt every 16 requests (16 * 125us = 2ms).
> > > 
> > > So with at least 2ms latency we did hit the sweet spot in several cases.
> > 
> > For 2ms, we should be able to handle this with the zero-length requests.
> 
> When the interrupt thread is the one that is also enqueuing the
> zero-length requests (like the uvc_video gadget is doing now), we won't
> be able to do that.

How long does enqueuing take? Does it take longer than the number of
intervals that it enqueues?

> > > > Can this other interrupt thread lower its priority relative to UVC? For
> > > > isoc endpoints, data is time critical.
> > > 
> > > The details are not that important. Sure, there is a bug that needed to
> > > be solved. But all I wanted is to improve the overall dwc3 driver.
> > > 
> > > > Currently dwc3 can have up to 255 TRBs per endpoint, potentially 255
> > > > zero-length requests. That's 255 uframes, or roughly ~30ms. Is your
> > > > worst latency more than 30ms? I.e. no handling of transfer completion
> > > > and ep_queue for the whole 255 intervals or 30ms. If that's the case,
> > > > we have problems that cannot just be solved in dwc3.
> > > 
> > > Yes. But as mentioned above, this was not the case. Speaking of, there is
> > > currently a bug in the uvc_video driver, which is not taking into account
> > > that actually every zero-length request should, without exception,
> > > trigger an interrupt.
> > 
> > Not necessarily. You can send multiple sets of zero-length requests
> > with, for example, IOC on the last request of the set.
> 
> Right. But for this we have to know if the last request that will be
> enqueued will be followed by an actual data request. As the uvc_video
> gadget is implemented now, we cannot do this.
> 
> It is only checking whether the prepared list is empty and then enqueues
> zero-length or data requests from the complete handler depending on the
> outcome. It does not know whether the prepared list will have some
> requests ready on the next enqueue.

Can we check that the prepare list always has X amount of requests
instead of checking whether it is empty? If not, fill it up to the X
amount with zero-length requests.
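Roughly something along the lines of the (completely untested) sketch
below is what I have in mind. The prep_count bookkeeping and the
uvc_next_data_req()/uvc_get_zero_len_req() helpers are made up for
illustration and are not part of the current uvc_video code:

/*
 * Illustrative only -- assumes video->req_lock is held and that
 * prep_count (made up here) tracks how many requests are currently
 * queued on the endpoint.
 */
#define UVCG_MIN_QUEUED		4	/* the "X" amount */

static void uvcg_keep_ep_primed(struct uvc_video *video)
{
	struct usb_request *req;

	while (video->prep_count < UVCG_MIN_QUEUED) {
		/* Next request carrying actual frame data, if any. */
		req = uvc_next_data_req(video);
		if (!req) {
			/* No payload ready: top up with a zero-length request. */
			req = uvc_get_zero_len_req(video);
			if (!req)
				break;
			/*
			 * Only the last zero-length request of the set needs
			 * to complete with an interrupt (IOC), so that we get
			 * called again before the queue drains.
			 */
			req->no_interrupt =
				(video->prep_count < UVCG_MIN_QUEUED - 1);
		}

		if (usb_ep_queue(video->ep, req, GFP_ATOMIC) < 0)
			break;

		video->prep_count++;
	}
}

That would also drop the interrupt rate to one IOC per top-up instead of
one per zero-length request.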
BR,
Thinh

> > > Currently we also only scatter them over the 16ms period, like with
> > > the actual payload. But since we feed the video stream with the
> > > interrupts, we lose 2ms of potential ep_queue calls with actual
> > > payload in the worst case.
> > > 
> > > My patch is already in the stack and will be sent today.
> > > 
> > > > > busy. As the dwc3 interrupt thread got to its call, the time was
> > > > > already over and the hw was already drained, although the started
> > > > > list was not yet empty, which was causing the next queued requests
> > > > > to be queued too late (zero-length or not).
> > > > > 
> > > > > Yes, this needed to be solved on the upper level first, by moving
> > > > > the long-running work of the other interrupt thread to another
> > > > > thread or even into userspace.
> > > > 
> > > > Right.
> > > > 
> > > > > However, I thought it would be great if we could somehow find this
> > > > > out in the dwc3 core and make the pump mechanism more robust
> > > > > against such late enqueues.
> > > > 
> > > > The dwc3 core handling of events and ep_queue is relatively quick.
> > > > I'm all for any optimization in the dwc3 core for performance.
> > > > However, I don't want to just continue looking for workarounds in
> > > > the dwc3 core without trying to solve the issue where it should be
> > > > solved. I don't want to sacrifice complexity and/or performance of
> > > > other applications just for UVC.
> > > 
> > > I totally understand this. And as we have already found out more and
> > > more about the underlying complexity of the dwc3 driver, I see more
> > > and more clearly how we have to handle the corner cases. I just
> > > started this conversation with Avichal and you in the other thread.
> > > 
> > > https://lore.kernel.org/all/17192e0f-7f18-49ae-96fc-71054d46f74a@xxxxxxxxxx/
> > > 
> > > I think there is some work to come. To be sure that everybody is on
> > > the same page I will prepare a roadmap on how to proceed and what to
> > > discuss. There are many cases interfering with each other, which
> > > makes the solution pretty complex.
> > 
> > That's great. Let's do that so we can resolve this issue.
> 
> Good
> 
> > > > > This all started with that series.
> > > > > 
> > > > > https://lore.kernel.org/all/20240307-dwc3-gadget-complete-irq-v1-0-4fe9ac0ba2b7@xxxxxxxxxxxxxx/
> > > > > 
> > > > > And patch 2 of this series did work well so far. The next move was
> > > > > this patch.
> > > > > 
> > > > > During the last week of debugging we found out that it has other
> > > > > issues. It is not always safe to read the HWO bit from the driver.
> > > > > 
> > > > > It turns out that after a new TRB was prepared with the HWO bit
> > > > > set, it is not safe to immediately read that value back, as the hw
> > > > > will be doing some operations on exactly that newly prepared TRB.
> > > > > 
> > > > > We ran into this problem when applying this patch. The TRB buffer
> > > > > list was actually filled, but we hit a false positive where the
> > > > > latest HWO bit was 0 (probably due to the hw action in the
> > > > > background) and therefore went into end transfer.
> > > > 
> > > > Thanks for the update.
> 
> -- 
> Pengutronix e.K.                           |                             |
> Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
> 31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
> Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |