Michael Grzeschik wrote:
> Hi Thinh,
>
> On Wed, Apr 28, 2021 at 01:45:11AM +0000, Thinh Nguyen wrote:
>> Michael Grzeschik wrote:
>>> On Tue, Apr 27, 2021 at 03:18:57AM +0000, Thinh Nguyen wrote:
>>>> Hi Michael,
>>>>
>>>> Thinh Nguyen wrote:
>>>>> Felipe Balbi wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Michael Grzeschik <mgr@xxxxxxxxxxxxxx> writes:
>>>>>>
>>>>>> <big snip>
>>>>>>
>>>>
>>>>
>>>> <bigger snip>
>>>>
>>>>
>>>>> I think I see the issue that Michael reported.
>>>>>
>>>>> The problem is that we're using num_pending_sgs to track both
>>>>> pending SG entries and queued SG entries. num_pending_sgs doesn't
>>>>> get updated until the TRB completion interrupt (ie
>>>>> XferInProgress). Before the driver queues more SG requests, it
>>>>> checks whether there are any pending SGs in the started request
>>>>> list before it prepares more. Since num_pending_sgs doesn't get
>>>>> updated until the request is completed, the driver doesn't
>>>>> process more until the request is completed.
>>>>>
>>>>> I need to review Michael's patches more next week, but I think
>>>>> what he suggested makes sense (in terms of proper usage of queued
>>>>> sgs vs pending sgs). BTW, please correct me if I'm wrong, but we
>>>>> do modify num_queued_sgs.
>>>>>
>>>>
>>>> There's still some issue with your patch. I think this should
>>>> cover it. Let me know if it works for you.
>>>
>>> This works for me! Will you spin a proper patch from that?
>>
>> Sure. I can do that after 5.13-rc1 is released.
>
> Great!
>
>>>
>>>> Note: this however probably needs more "Tested-by" and reviews
>>>> to make sure I'm not missing anything. I only ran some basic
>>>> tests, and will need to run more.
>>>
>>> You may already have mine:
>>>
>>> Tested-by: Michael Grzeschik <m.grzeschik@xxxxxxxxxxxxxx>
>>>
>>>> Let me know if this makes sense.
>>>
>>> From what I understand about the issue and the purpose of all the
>>> variables, this makes total sense to me. So thanks for taking over
>>> and making a proper solution.
>>>
>>> Thanks,
>>> Michael
>>>
>>
>> Thanks for the Tested-by.
>>
>> Btw, any reason for using SG with transfers smaller than PAGE_SIZE?
>> I assume your platform's page size is 4KB, but you're splitting your
>> 3KB transfers into smaller pieces. Was it like this before? Note
>> that DWC3 has a limited internal TRB cache. But what you're doing
>> now is fine and doesn't impact performance.
>
> It all comes from a limitation of the uvc_video gadget. Look into the
> patches I sent to support sg in the uvc_video driver. It just maps
> entries from an sg list coming from videobuf2 to a req->sg list. In
> uvc_video_alloc_requests the req->length will be set to req_size,
> which is calculated with:
>
> ep->maxpacket (1024) * maxburst (1-15) * mult (1-3)
>
> With maxburst = 1 and mult = 3 it currently results in a req->length
> of:
>
> 1024 * 1 * 3 = 3072

I wasn't referring to this; I was just wondering why uvc uses SG.
Correct me if I'm wrong, but doesn't uvc allocate and use a contiguous
buffer? Depending on the setup, a normal request may show better
performance.

> This request will use one extra sg, created by the uvc driver, to add
> a header of 2 bytes per request, and then has 3070 bytes left for the
> payload. For the payload it walks the vb2 sg list and advances each
> sg entry by the used offset; whenever it reaches an sg page boundary
> it has to add one extra sg.
>
> So this results in the funny patterns you see.
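
Roughly, the mapping you describe would look like the sketch below. It
is illustrative only, not the actual uvc_video code: the function name
and UVC_HDR_LEN are made up, error handling is omitted, and it assumes
the vb2 list covers the whole payload and req->sg has enough entries.

/*
 * Illustrative sketch only, not the actual uvc_video code. It shows
 * how a req->length = 3072 request maps to a 2-byte header sg plus
 * payload sgs taken from the vb2 sg list, splitting whenever the
 * payload crosses a vb2 entry (page) boundary.
 */
#include <linux/minmax.h>
#include <linux/scatterlist.h>
#include <linux/usb/gadget.h>

#define UVC_HDR_LEN	2	/* made-up name for the 2-byte header */

static void sketch_map_request(struct usb_request *req, void *hdr,
			       struct scatterlist *vb2_sg,
			       unsigned int offset)
{
	struct scatterlist *out = req->sg;
	unsigned int left = req->length - UVC_HDR_LEN;	/* 3070 */
	unsigned int nents = 1;

	/* sg[0] carries the per-request UVC payload header */
	sg_set_buf(out, hdr, UVC_HDR_LEN);
	out = sg_next(out);

	/*
	 * Walk the vb2 list from the current offset. A payload span
	 * that runs past the end of the current vb2 entry needs one
	 * more output sg, which is where the extra entry comes from.
	 */
	while (left) {
		unsigned int len = min(left, vb2_sg->length - offset);

		sg_set_page(out, sg_page(vb2_sg), len,
			    vb2_sg->offset + offset);
		left -= len;
		offset = 0;
		nents++;
		vb2_sg = sg_next(vb2_sg);
		out = sg_next(out);
	}

	req->num_sgs = nents;
}

Since the 3070-byte payload can start at an arbitrary offset inside a
4KB page, it straddles at most one page boundary, so each request ends
up with 2 or 3 sg entries in total, hence the varying pattern.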
> We can increase the req->length to 46080 with a maxburst of 15 for
> usb3, but I did not try that out yet.
>

I wouldn't recommend that. Unless you know you're using both a DWC3x
device and host, most other controllers we know of cannot handle high
isoc throughput, especially at bInterval=1.

BR,
Thinh
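
For a sense of scale, rough numbers (my arithmetic, assuming
isochronous at bInterval = 1, i.e. one service interval every 125 us,
8000 per second):

46080 bytes * 8000 intervals/s = 368,640,000 bytes/s ~= 2.95 Gbit/s

46080 bytes is also just under the SuperSpeed isoc per-interval
ceiling of 3 * 16 * 1024 = 49152 bytes, leaving little slack; and
since isochronous transfers have no retries, a controller that can't
keep up at bInterval = 1 simply drops data.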