Michael Grzeschik wrote:
> On Wed, Apr 28, 2021 at 11:25:05PM +0000, Thinh Nguyen wrote:
>> Michael Grzeschik wrote:
>>> Hi Thinh,
>>>
>>> On Wed, Apr 28, 2021 at 01:45:11AM +0000, Thinh Nguyen wrote:
>>>> Michael Grzeschik wrote:
>>>>> On Tue, Apr 27, 2021 at 03:18:57AM +0000, Thinh Nguyen wrote:
>>>>>> Hi Michael,
>>>>>>
>>>>>> Thinh Nguyen wrote:
>>>>>>> Felipe Balbi wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Michael Grzeschik <mgr@xxxxxxxxxxxxxx> writes:
>>>>>>>>
>>>>>>>> <big snip>
>>>>>>>>
>>>>>>
>>>>>>
>>>>>> <bigger snip>
>>>>>>
>>>>>>
>>>>>>> I think I see the issue that Michael reported.
>>>>>>>
>>>>>>> The problem is that we're using num_pending_sgs to track both
>>>>>>> pending SG entries and queued SG entries. num_pending_sgs doesn't
>>>>>>> get updated until the TRB completion interrupt (i.e.
>>>>>>> XferInProgress). Before the driver queues more SG requests, it
>>>>>>> checks whether there's any pending SG in the started request list
>>>>>>> before it prepares more. Since num_pending_sgs doesn't get
>>>>>>> updated until the request is completed, the driver doesn't
>>>>>>> process more until the request is completed.
>>>>>>>
>>>>>>> I need to review Michael's patches more next week, but I think
>>>>>>> what he suggested makes sense (in terms of proper usage of queued
>>>>>>> sgs vs pending sgs). BTW, please correct me if I'm wrong, but we
>>>>>>> do modify num_queued_sgs.
>>>>>>>
>>>>>>
>>>>>> There's still some issue with your patch. I think this should
>>>>>> cover it. Let me know if it works for you.
>>>>>
>>>>> This works for me! Will you spin a proper patch from that?
>>>>
>>>> Sure. I can do that after 5.13-rc1 is released.
>>>
>>> Great!
>>>
>>>>>
>>>>>> Note: this however probably needs more "Tested-by" tags and
>>>>>> reviews to make sure I'm not missing anything. I only ran some
>>>>>> basic tests, and will need to run more.
>>>>>
>>>>> You may already have mine:
>>>>>
>>>>> Tested-by: Michael Grzeschik <m.grzeschik@xxxxxxxxxxxxxx>
>>>>>
>>>>>> Let me know if this makes sense.
>>>>>
>>>>> From what I understand about the issue and the purpose of all the
>>>>> variables, this makes total sense to me. So thanks for taking over
>>>>> and making a proper solution.
>>>>>
>>>>> Thanks,
>>>>> Michael
>>>>>
>>>>
>>>> Thanks for the Tested-by.
>>>>
>>>> Btw, any reason for using SG with transfers smaller than PAGE_SIZE?
>>>> I assume your platform's page size is 4KB, but you're splitting
>>>> your 3KB transfer into smaller pieces. Was it like this before?
>>>> Note that DWC3 has a limited internal TRB cache. But what you're
>>>> doing now is fine and doesn't impact performance.
>>>
>>> It all comes from a limitation of the uvc_video gadget. Look into
>>> the patches I sent to support sg in the uvc_video driver. The code
>>> just maps entries from an sg list coming from videobuf2 to a
>>> req->sg list. In uvc_video_alloc_requests, req->length is set to
>>> req_size, which is calculated as:
>>>
>>> ep->maxpacket(1024) * maxburst(1-15) * mult(1-3)
>>>
>>> With maxburst = 1 and mult = 3, this currently results in a
>>> req->length of:
>>>
>>> 1024 * 1 * 3 = 3072
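
(A minimal C sketch of the sizing described above; uvc_req_size is a
hypothetical helper for illustration, not the actual uvc_video code —
only ep->maxpacket, maxburst, and mult come from the thread.)

#include <linux/usb/gadget.h>

/* Per-request payload size: one maxpacket-sized packet scaled by the
 * endpoint's burst and mult settings, as described above.
 */
static unsigned int uvc_req_size(struct usb_ep *ep,
				 unsigned int maxburst,	/* 1..15 */
				 unsigned int mult)	/* 1..3 */
{
	/* e.g. 1024 * 1 * 3 = 3072 for the setup in this thread */
	return ep->maxpacket * maxburst * mult;
}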
>>
>> I wasn't referring to this; I was just wondering why uvc uses SG.
>> Correct me if I'm wrong, but doesn't uvc allocate and use contiguous
>> buffers?
>
> Not necessarily. Depending on whether the source creates vb2 buffers
> with the MMU enabled or directly, you may get different patterns. The
> patches I sent are also able to handle contiguous buffers. In that
> case, vb2 would just hand over a scatter list with one big entry,
> which would be totally fine, and we would not have to scatter it for
> the gadget controller.
>
>> Depending on the setup, a normal request may show better performance.
>
> Right. In my setup, the data is coming from the MMU.

I see. Thanks for the clarification.

Thinh
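
(For the num_pending_sgs discussion above, a simplified C sketch of
the bookkeeping distinction. This is illustrative only, not the actual
dwc3 code; the struct and helper names are assumptions based on the
field names used in the thread.)

#include <linux/types.h>

/* Three counters with different lifetimes:
 * - num_sgs:         total entries in the request's sg list
 * - num_queued_sgs:  entries already mapped to TRBs; advances as
 *                    TRBs are prepared
 * - num_pending_sgs: entries whose TRBs have not yet completed; only
 *                    drops at the completion interrupt
 */
struct sg_bookkeeping {
	unsigned int num_sgs;
	unsigned int num_queued_sgs;
	unsigned int num_pending_sgs;
};

/* Sensible gate for preparing more TRBs: compare against what has
 * already been queued, which updates as soon as TRBs are set up.
 */
static bool all_sgs_queued(const struct sg_bookkeeping *req)
{
	return req->num_queued_sgs == req->num_sgs;
}

/* The reported bug pattern: gating new TRB setup on the pending
 * count stalls the transfer, because num_pending_sgs is only
 * decremented at the completion interrupt.
 */
static bool stalls_until_completion(const struct sg_bookkeeping *req)
{
	return req->num_pending_sgs == 0;
}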