Michael Grzeschik wrote: > On Tue, Apr 27, 2021 at 03:18:57AM +0000, Thinh Nguyen wrote: >> Hi Michael, >> >> Thinh Nguyen wrote: >>> Felipe Balbi wrote: >>>> >>>> HI, >>>> >>>> Michael Grzeschik <mgr@xxxxxxxxxxxxxx> writes: >>>> >>>> <big snip> >>>> >> >> >> <bigger snip> >> >> >>> I think I see the issue that Michael reported. >>> >>> The problem is that we're using num_pending_sgs to track both pending SG >>> entries and queued SG entries. num_pending_sgs doesn't get updated until >>> TRB completion interrupt (ie XferInProgress). Before the driver queues >>> more SG requests, it will check if there's any pending SG in the started >>> request list before it prepares more. Since the num_pending_sgs doesn't >>> get updated until the request is completed, the driver doesn't process >>> more until the request is completed. >>> >>> I need to review more on Michael's patches next week, but I think what >>> he suggested makes sense (in term of properly usage of queued sgs vs >>> pending sgs). BTW, please correct me if I'm wrong, but we do modify >>> num_queued_sgs. >>> >> >> There's still some issue with your patch. I think this should cover it. >> Let me know if it works for you. > > This works for me! Will you spin a proper patch from that? Sure. I can do that after 5.13-rc1 is released > >> Note: this however probably needs more "Tested-by" and reviews >> to make sure I'm not missing anything. I only ran some basic tests, >> and will need to run more. > > You may already have mine: > > Tested-by: Michael Grzeschik <m.grzeschik@xxxxxxxxxxxxxx> > >> Let me know if this makes sense. > > From what I understand about the issue and the purpose of all > variables this makes total sense to me. So thanks for taking over > and make a proper solution. > > Thanks, > Michael > Thanks for the Tested-by. Btw, any reason for using SG with transfer less than PAGE_SIZE? I assume your platform is 4KB, but you're splitting your 3KB transfer to smaller. Was it like this before? Note that DWC3 has a limited number of internal TRB cache. But what you're doing now is fine and doesn't impact performance. BR, Thinh >> >> >> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c >> index dd80e5ca8c78..040cc67b3361 100644 >> --- a/drivers/usb/dwc3/gadget.c >> +++ b/drivers/usb/dwc3/gadget.c >> @@ -1244,6 +1244,7 @@ static int dwc3_prepare_trbs_sg(struct dwc3_ep >> *dep, >> req->start_sg = sg_next(s); >> >> req->num_queued_sgs++; >> + req->num_pending_sgs--; >> >> /* >> * The number of pending SG entries may not correspond >> to the >> @@ -1251,7 +1252,7 @@ static int dwc3_prepare_trbs_sg(struct dwc3_ep >> *dep, >> * don't include unused SG entries. >> */ >> if (length == 0) { >> - req->num_pending_sgs -= >> req->request.num_mapped_sgs - req->num_queued_sgs; >> + req->num_pending_sgs = 0; >> break; >> } >> >> @@ -2867,15 +2868,15 @@ static int >> dwc3_gadget_ep_reclaim_trb_sg(struct dwc3_ep *dep, >> struct dwc3_trb *trb = &dep->trb_pool[dep->trb_dequeue]; >> struct scatterlist *sg = req->sg; >> struct scatterlist *s; >> - unsigned int pending = req->num_pending_sgs; >> + unsigned int num_queued = req->num_queued_sgs; >> unsigned int i; >> int ret = 0; >> >> - for_each_sg(sg, s, pending, i) { >> + for_each_sg(sg, s, num_queued, i) { >> trb = &dep->trb_pool[dep->trb_dequeue]; >> >> req->sg = sg_next(s); >> - req->num_pending_sgs--; >> + req->num_queued_sgs--; >> >> ret = dwc3_gadget_ep_reclaim_completed_trb(dep, req, >> trb, event, status, true); >> @@ -2898,7 +2899,7 @@ static int >> dwc3_gadget_ep_reclaim_trb_linear(struct dwc3_ep *dep, >> >> static bool dwc3_gadget_ep_request_completed(struct dwc3_request *req) >> { >> - return req->num_pending_sgs == 0; >> + return req->num_pending_sgs == 0 && req->num_queued_sgs == 0; >> } >> >> static int dwc3_gadget_ep_cleanup_completed_request(struct dwc3_ep *dep, >> @@ -2907,7 +2908,7 @@ static int >> dwc3_gadget_ep_cleanup_completed_request(struct dwc3_ep *dep, >> { >> int ret; >> >> - if (req->num_pending_sgs) >> + if (req->request.num_mapped_sgs) >> ret = dwc3_gadget_ep_reclaim_trb_sg(dep, req, event, >> status); >> else >