Hi, Michael Grzeschik <mgr@xxxxxxxxxxxxxx> writes: > On Thu, Apr 22, 2021 at 01:56:09PM +0300, Felipe Balbi wrote: >> >>Hi, >> >>(subject format as well) >> >>Michael Grzeschik <m.grzeschik@xxxxxxxxxxxxxx> writes: >>> The variable pending_sgs was used to keep track of handled >>> sgs in one request. But instead queued_sgs is being decremented >> >>no, it wasn't. If the total number of entries in the scatter list is 'x' >>and we have transferred (completed) 'y' entries, then pending_sgs should >>be (x - y). >> >>> on every handled sg. This patch fixes the usage of the variable >>> to use queued_sgs instead as intended. >>> >>> Signed-off-by: Michael Grzeschik <m.grzeschik@xxxxxxxxxxxxxx> >>> --- >>> drivers/usb/dwc3/gadget.c | 8 ++++---- >>> 1 file changed, 4 insertions(+), 4 deletions(-) >>> >>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c >>> index 118b5bcc565d6..2d7d861b13b31 100644 >>> --- a/drivers/usb/dwc3/gadget.c >>> +++ b/drivers/usb/dwc3/gadget.c >>> @@ -2856,7 +2856,7 @@ static int dwc3_gadget_ep_reclaim_trb_sg(struct dwc3_ep *dep, >>> struct dwc3_trb *trb = &dep->trb_pool[dep->trb_dequeue]; >>> struct scatterlist *sg = req->sg; >>> struct scatterlist *s; >>> - unsigned int pending = req->num_pending_sgs; >>> + unsigned int pending = req->num_queued_sgs; >>> unsigned int i; >>> int ret = 0; >>> >>> @@ -2864,7 +2864,7 @@ static int dwc3_gadget_ep_reclaim_trb_sg(struct dwc3_ep *dep, >>> trb = &dep->trb_pool[dep->trb_dequeue]; >>> >>> req->sg = sg_next(s); >>> - req->num_pending_sgs--; >>> + req->num_queued_sgs--; >> >>no, this is wrong. queued shouldn't be modified as it comes straight >>from the gadget driver. This is the number of entries in the request >>that the gadget driver gave us. We don't want to modify it. > > Right, but pending_sgs than has two use cases. One to track the mapped > sgs that got not queued. And one here to to track the "queued sgs" that > got dequeued. no, we have num_mapped_sgs for the number of mapped sgs. It's just that right before kicking the transfer the number of pending sgs and the number of mapped sgs is the same. >>> >>> ret = dwc3_gadget_ep_reclaim_completed_trb(dep, req, >>> trb, event, status, true); >>> @@ -2887,7 +2887,7 @@ static int dwc3_gadget_ep_reclaim_trb_linear(struct dwc3_ep *dep, >>> >>> static bool dwc3_gadget_ep_request_completed(struct dwc3_request *req) >>> { >>> - return req->num_pending_sgs == 0; >>> + return req->num_queued_sgs == 0; >> >>nope, request is, indeed, completed when there no more pending entries >>to be consumed. >> >>What sort of problem are you dealing with? Got any way of reproducing >>it? Got some trace output showing the issue? > > I digged a little bit deeper to fully understand the issue here. What > I found is that in dwc3_prepare_trbs will make two assumptions on > num_pending_sgs. > > When the real purpose of the variable is to track the dequeued trbs. its purpose is *not* to track the dequeued trbs. > Than we have to fix the started list handling in the dwc3_prepare_trbs. > > The comment in the function says: > > /* > * We can get in a situation where there's a request in the started list > * but there weren't enough TRBs to fully kick it in the first time > * around, so it has been waiting for more TRBs to be freed up. > * > * In that case, we should check if we have a request with pending_sgs > * in the started list and prepare TRBs for that request first, > * otherwise we will prepare TRBs completely out of order and that will > * break things. > */ > list_for_each_entry(req, &dep->started_list, list) { > if (req->num_pending_sgs > 0) { > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > This condition seems to be made on a wrong assumption, thinking the > num_pending_sgs was decremented after dwc3_prepare_one_trb was called on parts > of that requests sgs but not all. say num_mapped_sgs = 500. The driver has 255 TRBs for use (+1 link trb). Clearly not all SGs will fit in one go. Driver will set num_pending_sgs to 500, because that's the number of sgs that need to be transferred. For each completion we will decrement num_pending_sgs. Assuming all 255 were free and used up, num_pending_sgs will be 245 at this point. Driver *must* try to kick these 245 TRBs. > But the completion path can not also depend on that variable to be decremented > after parts of that sgs get handled. Therefor I came up with this second patch. > > What do you think, would be the right way to solve this? unless you can show that a problem really exists, I don't think we should do anything. Got a minimal reproduction method somewhere? Tracepoint of the problem? > The second issue I see in dwc3_prepare_trbs is the bail out of iterations over > the pending and starting lists. Whenever one case of (req->num_pending_sgs > 0) > will be true after calling dwc3_prepare_trbs_sg, the function returns immediately. > > In my case, where my uvc_video now enqueues up to 64 requests, every single 64 requests is not (necessarily) the same as 64 TRBs or 64 SG entries. You need to explain this a little better. > kick_transfer called from one ep_queue will ensure only one call of > dwc3_prepare_trbs_sg on one entry of the pending list. This bottleneck makes > the hardware refill to slow and the hardware will drain fast even though enough > pending buffers are there. what's the size of the buffers? are they contiguous or held in scatter lists? Got tracepoints of the problem in question? > I suggest to remove those returns. we may get there when you successfully show the existence of a problem :-) -- balbi
Attachment:
signature.asc
Description: PGP signature