Re: [PATCH 2/2] dwc3: gadget: fix tracking of used sgs in request

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

Michael Grzeschik <mgr@xxxxxxxxxxxxxx> writes:
> On Thu, Apr 22, 2021 at 01:56:09PM +0300, Felipe Balbi wrote:
>>
>>Hi,
>>
>>(subject format as well)
>>
>>Michael Grzeschik <m.grzeschik@xxxxxxxxxxxxxx> writes:
>>> The variable pending_sgs was used to keep track of handled
>>> sgs in one request. But instead queued_sgs is being decremented
>>
>>no, it wasn't. If the total number of entries in the scatter list is 'x'
>>and we have transferred (completed) 'y' entries, then pending_sgs should
>>be (x - y).
>>
>>> on every handled sg. This patch fixes the usage of the variable
>>> to use queued_sgs instead as intended.
>>>
>>> Signed-off-by: Michael Grzeschik <m.grzeschik@xxxxxxxxxxxxxx>
>>> ---
>>>  drivers/usb/dwc3/gadget.c | 8 ++++----
>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>> index 118b5bcc565d6..2d7d861b13b31 100644
>>> --- a/drivers/usb/dwc3/gadget.c
>>> +++ b/drivers/usb/dwc3/gadget.c
>>> @@ -2856,7 +2856,7 @@ static int dwc3_gadget_ep_reclaim_trb_sg(struct dwc3_ep *dep,
>>>  	struct dwc3_trb *trb = &dep->trb_pool[dep->trb_dequeue];
>>>  	struct scatterlist *sg = req->sg;
>>>  	struct scatterlist *s;
>>> -	unsigned int pending = req->num_pending_sgs;
>>> +	unsigned int pending = req->num_queued_sgs;
>>>  	unsigned int i;
>>>  	int ret = 0;
>>>
>>> @@ -2864,7 +2864,7 @@ static int dwc3_gadget_ep_reclaim_trb_sg(struct dwc3_ep *dep,
>>>  		trb = &dep->trb_pool[dep->trb_dequeue];
>>>
>>>  		req->sg = sg_next(s);
>>> -		req->num_pending_sgs--;
>>> +		req->num_queued_sgs--;
>>
>>no, this is wrong. queued shouldn't be modified as it comes straight
>>from the gadget driver. This is the number of entries in the request
>>that the gadget driver gave us. We don't want to modify it.
>
> Right, but pending_sgs than has two use cases. One to track the mapped
> sgs that got not queued. And one here to to track the "queued sgs" that
> got dequeued.

no, we have num_mapped_sgs for the number of mapped sgs. It's just that
right before kicking the transfer the number of pending sgs and the
number of mapped sgs is the same.

>>>
>>>  		ret = dwc3_gadget_ep_reclaim_completed_trb(dep, req,
>>>  				trb, event, status, true);
>>> @@ -2887,7 +2887,7 @@ static int dwc3_gadget_ep_reclaim_trb_linear(struct dwc3_ep *dep,
>>>
>>>  static bool dwc3_gadget_ep_request_completed(struct dwc3_request *req)
>>>  {
>>> -	return req->num_pending_sgs == 0;
>>> +	return req->num_queued_sgs == 0;
>>
>>nope, request is, indeed, completed when there no more pending entries
>>to be consumed.
>>
>>What sort of problem are you dealing with? Got any way of reproducing
>>it? Got some trace output showing the issue?
>
> I digged a little bit deeper to fully understand the issue here. What
> I found is that in dwc3_prepare_trbs will make two assumptions on
> num_pending_sgs.
>
> When the real purpose of the variable is to track the dequeued trbs.

its purpose is *not* to track the dequeued trbs.

> Than we have to fix the started list handling in the dwc3_prepare_trbs.
>
> The comment in the function says:
>
>          /*
>           * We can get in a situation where there's a request in the started list
>           * but there weren't enough TRBs to fully kick it in the first time
>           * around, so it has been waiting for more TRBs to be freed up.
>           *
>           * In that case, we should check if we have a request with pending_sgs
>           * in the started list and prepare TRBs for that request first,
>           * otherwise we will prepare TRBs completely out of order and that will
>           * break things.
>           */
>          list_for_each_entry(req, &dep->started_list, list) {
> 		if (req->num_pending_sgs > 0) {
> 		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> This condition seems to be made on a wrong assumption, thinking the
> num_pending_sgs was decremented after dwc3_prepare_one_trb was called on parts
> of that requests sgs but not all.

say num_mapped_sgs = 500. The driver has 255 TRBs for use (+1 link
trb). Clearly not all SGs will fit in one go. Driver will set
num_pending_sgs to 500, because that's the number of sgs that need to be
transferred.

For each completion we will decrement num_pending_sgs. Assuming all 255
were free and used up, num_pending_sgs will be 245 at this point. Driver
*must* try to kick these 245 TRBs.

> But the completion path can not also depend on that variable to be decremented
> after parts of that sgs get handled. Therefor I came up with this second patch.
>
> What do you think, would be the right way to solve this?

unless you can show that a problem really exists, I don't think we
should do anything. Got a minimal reproduction method somewhere?
Tracepoint of the problem?

> The second issue I see in dwc3_prepare_trbs is the bail out of iterations over
> the pending and starting lists. Whenever one case of (req->num_pending_sgs > 0)
> will be true after calling dwc3_prepare_trbs_sg, the function returns immediately.
>
> In my case, where my uvc_video now enqueues up to 64 requests, every single

64 requests is not (necessarily) the same as 64 TRBs or 64 SG
entries. You need to explain this a little better.

> kick_transfer called from one ep_queue will ensure only one call of
> dwc3_prepare_trbs_sg on one entry of the pending list. This bottleneck makes
> the hardware refill to slow and the hardware will drain fast even though enough
> pending buffers are there.

what's the size of the buffers? are they contiguous or held in scatter
lists? Got tracepoints of the problem in question?

> I suggest to remove those returns.

we may get there when you successfully show the existence of a problem
:-)

-- 
balbi

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux