Hi Thinh, On Wed, Apr 28, 2021 at 01:45:11AM +0000, Thinh Nguyen wrote:
Michael Grzeschik wrote:On Tue, Apr 27, 2021 at 03:18:57AM +0000, Thinh Nguyen wrote:Hi Michael, Thinh Nguyen wrote:Felipe Balbi wrote:HI, Michael Grzeschik <mgr@xxxxxxxxxxxxxx> writes: <big snip><bigger snip>I think I see the issue that Michael reported. The problem is that we're using num_pending_sgs to track both pending SG entries and queued SG entries. num_pending_sgs doesn't get updated until TRB completion interrupt (ie XferInProgress). Before the driver queues more SG requests, it will check if there's any pending SG in the started request list before it prepares more. Since the num_pending_sgs doesn't get updated until the request is completed, the driver doesn't process more until the request is completed. I need to review more on Michael's patches next week, but I think what he suggested makes sense (in term of properly usage of queued sgs vs pending sgs). BTW, please correct me if I'm wrong, but we do modify num_queued_sgs.There's still some issue with your patch. I think this should cover it. Let me know if it works for you.This works for me! Will you spin a proper patch from that?Sure. I can do that after 5.13-rc1 is released
Great!
Note: this however probably needs more "Tested-by" and reviews to make sure I'm not missing anything. I only ran some basic tests, and will need to run more.You may already have mine: Tested-by: Michael Grzeschik <m.grzeschik@xxxxxxxxxxxxxx>Let me know if this makes sense.From what I understand about the issue and the purpose of all variables this makes total sense to me. So thanks for taking over and make a proper solution. Thanks, MichaelThanks for the Tested-by. Btw, any reason for using SG with transfer less than PAGE_SIZE? I assume your platform is 4KB, but you're splitting your 3KB transfer to smaller. Was it like this before? Note that DWC3 has a limited number of internal TRB cache. But what you're doing now is fine and doesn't impact performance.
It all comes from the limitation of the uvc_video gadget. Look into the patches I send to support sg in uvc_video driver. It just maps entries from an sg list comming from videobuf2 to an req->sg list. In uvc_video_alloc_requests the req->length will be set to req_size which is calculated with: ep->maxpacket(1024) * maxburst(1-15) * mult(1-3) With maxburst = 1 and mult = 3 it currently reults in an req->length of: 1024 * 1 * 3 = 3072 This request will use one extra sg created from the uvc driver adding a header of 2 bytes per request and then has 3070 bytes left for the payload which will look into the vb2 sg list and increment every sg entrie with the the used offset until it reaches sg page boundaries and in that case will have to add one extra sg. So this results in the funny patterns you see. We can increase the req->length to 46080 with a maxburst of 15 for usb3 but I did not try that out yet. Thanks, Michael
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c index dd80e5ca8c78..040cc67b3361 100644 --- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -1244,6 +1244,7 @@ static int dwc3_prepare_trbs_sg(struct dwc3_ep *dep, req->start_sg = sg_next(s); req->num_queued_sgs++; + req->num_pending_sgs--; /* * The number of pending SG entries may not correspond to the @@ -1251,7 +1252,7 @@ static int dwc3_prepare_trbs_sg(struct dwc3_ep *dep, * don't include unused SG entries. */ if (length == 0) { - req->num_pending_sgs -= req->request.num_mapped_sgs - req->num_queued_sgs; + req->num_pending_sgs = 0; break; } @@ -2867,15 +2868,15 @@ static int dwc3_gadget_ep_reclaim_trb_sg(struct dwc3_ep *dep, struct dwc3_trb *trb = &dep->trb_pool[dep->trb_dequeue]; struct scatterlist *sg = req->sg; struct scatterlist *s; - unsigned int pending = req->num_pending_sgs; + unsigned int num_queued = req->num_queued_sgs; unsigned int i; int ret = 0; - for_each_sg(sg, s, pending, i) { + for_each_sg(sg, s, num_queued, i) { trb = &dep->trb_pool[dep->trb_dequeue]; req->sg = sg_next(s); - req->num_pending_sgs--; + req->num_queued_sgs--; ret = dwc3_gadget_ep_reclaim_completed_trb(dep, req, trb, event, status, true); @@ -2898,7 +2899,7 @@ static int dwc3_gadget_ep_reclaim_trb_linear(struct dwc3_ep *dep, static bool dwc3_gadget_ep_request_completed(struct dwc3_request *req) { - return req->num_pending_sgs == 0; + return req->num_pending_sgs == 0 && req->num_queued_sgs == 0; } static int dwc3_gadget_ep_cleanup_completed_request(struct dwc3_ep *dep, @@ -2907,7 +2908,7 @@ static int dwc3_gadget_ep_cleanup_completed_request(struct dwc3_ep *dep, { int ret; - if (req->num_pending_sgs) + if (req->request.num_mapped_sgs) ret = dwc3_gadget_ep_reclaim_trb_sg(dep, req, event, status); else
-- Pengutronix e.K. | | Steuerwalder Str. 21 | http://www.pengutronix.de/ | 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 | Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
Attachment:
signature.asc
Description: PGP signature