Hi, Thinh Nguyen <thinh.nguyen@xxxxxxxxxxxx> writes: > On 1/17/2019 11:12 PM, Felipe Balbi wrote: >> Hi, >> >> Thinh Nguyen <thinh.nguyen@xxxxxxxxxxxx> writes: >>> @@ -1409,15 +1407,11 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep, >>> if (r == req) { >>> /* wait until it is processed */ >>> dwc3_stop_active_transfer(dep, true); >>> >>> I ran into a regression with this patch. DWC3 will cleanup cancelled >>> requests on END_TRANSFER command completion. However, if for some >>> reason, the driver is unable to send the command, it will never >> by why wouldn't the driver send the command? That seems to be the error >> we should be looking at. Got some tracepoints available? > > Yes. Attached is the tracepoint. Device got stuck on dequeue on the last > line of the tracepoint. to me, it seems that we're trying to dequeue the same request twice. Here's a filtered list for the request in question. Notice last two lines. They are both dwc3_ep_dequeue() without a matching dwc3_ep_queue(): irq/16-dwc3-3983 [004] d... 130.997794: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 8192/8192 zsI ==> 0 file-storage-3982 [006] d... 130.998090: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115 irq/16-dwc3-3983 [004] d... 130.998535: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0 file-storage-3982 [006] d... 130.998584: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115 irq/16-dwc3-3983 [004] d... 130.999196: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0 file-storage-3982 [006] d... 130.999245: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115 irq/16-dwc3-3983 [004] d... 130.999871: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0 file-storage-3982 [006] d... 130.999917: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115 irq/16-dwc3-3983 [004] d... 131.000545: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0 file-storage-3982 [006] d... 131.000554: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/13 zsI ==> -115 irq/16-dwc3-3983 [004] d... 131.000830: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 13/13 zsI ==> 0 file-storage-3982 [006] d... 131.001021: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115 irq/16-dwc3-3983 [004] d... 131.001763: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0 file-storage-3982 [006] d... 131.001809: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115 irq/16-dwc3-3983 [004] d... 131.002437: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0 file-storage-3982 [006] d... 131.002483: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115 irq/16-dwc3-3983 [004] d... 131.003097: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0 file-storage-3982 [006] d... 131.003144: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/8192 zsI ==> -115 irq/16-dwc3-3983 [004] d... 131.003610: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 8192/8192 zsI ==> 0 file-storage-3982 [006] d... 131.003790: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115 irq/16-dwc3-3983 [004] d... 131.004186: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0 file-storage-3982 [006] d... 131.004217: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115 irq/16-dwc3-3983 [004] d... 131.004848: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0 file-storage-3982 [006] d... 131.004892: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115 irq/16-dwc3-3983 [004] d... 131.005514: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0 file-storage-3982 [006] d... 131.005560: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115 irq/16-dwc3-3983 [004] d... 131.006174: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0 file-storage-3982 [006] d... 131.006218: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/13 zsI ==> -115 irq/16-dwc3-3983 [004] d... 131.006469: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 13/13 zsI ==> 0 file-storage-3982 [006] d... 131.006680: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115 irq/16-dwc3-3983 [004] d... 131.007367: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0 file-storage-3982 [006] d... 131.007398: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115 irq/16-dwc3-3983 [004] d... 131.008050: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0 file-storage-3982 [006] d... 131.008098: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115 irq/16-dwc3-3983 [004] d... 131.008723: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0 file-storage-3982 [006] d... 131.008767: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/8192 zsI ==> -115 irq/16-dwc3-3983 [004] d... 131.009231: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 8192/8192 zsI ==> 0 file-storage-3982 [006] d... 131.009522: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115 irq/16-dwc3-3983 [004] d... 131.009946: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0 file-storage-3982 [006] d... 131.009983: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115 irq/16-dwc3-3983 [004] d... 131.010617: dwc3_gadget_giveback: ep1in: req 00000000eccaa10f length 16384/16384 zsI ==> 0 file-storage-3982 [006] d... 131.010663: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115 file-storage-3982 [006] .... 131.065766: dwc3_ep_dequeue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115 file-storage-3982 [006] .... 135.315162: dwc3_ep_dequeue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115 We need to see what happens after we print this extra dequeue line here. We also need to know why upper layer is dequeueing the same request twice. It seems like we're dealing with a gadget driver bug. Granted, we need to protect ourselves from it, but we don't know why upper layer dequeues the same thing twice; nor do we know what happens after this last line here. Looking at the last few lines relating to ep1in we have: > file-storage-3982 [006] d... 131.010663: dwc3_ep_queue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115 queue the request in question > file-storage-3982 [006] d... 131.010667: dwc3_prepare_trb: ep1in: trb 000000002ab8a1f9 buf 00000000bde24000 size 16384 ctrl 00000811 (Hlcs:sC:normal) prepare a TRB for it > file-storage-3982 [006] d... 131.010674: dwc3_gadget_ep_cmd: ep1in: cmd 'Update Transfer' [30007] params 00000000 00000000 00000000 --> status: Successful no status update transfer > irq/16-dwc3-3983 [004] d... 131.010942: dwc3_event: event (00004086): ep1in: Transfer In Progress [0] (sIm) transfer in progress > irq/16-dwc3-3983 [004] d... 131.010942: dwc3_complete_trb: ep1in: trb 00000000426cd8cf buf 00000000bde20000 size 0 ctrl 00000810 (hlcs:sC:normal) complete a TRB in full. Where is our giveback? This is not a chained TRB, so we should have a call to giveback here. This is already fishy. > file-storage-3982 [006] d... 131.010994: dwc3_ep_queue: ep1in: req 00000000f7765e56 length 0/16384 zsI ==> -115 new request coming in > file-storage-3982 [006] d... 131.010998: dwc3_prepare_trb: ep1in: trb 0000000065d9143d buf 00000000bde28000 size 16384 ctrl 00000811 (Hlcs:sC:normal) another TRB > file-storage-3982 [006] d... 131.011005: dwc3_gadget_ep_cmd: ep1in: cmd 'Update Transfer' [30007] params 00000000 00000000 00000000 --> status: Successful update transfer > file-storage-3982 [006] .... 131.065558: dwc3_ep_dequeue: ep1in: req 00000000f7765e56 length 0/16384 zsI ==> -115 a call to dequeue for the new request. Request is moved to cancelled list > file-storage-3982 [006] d... 131.065687: dwc3_gadget_ep_cmd: ep1in: cmd 'End Transfer' [30d08] params 00000000 00000000 00000000 --> status: Successful End transfer issued > irq/16-dwc3-3983 [004] d... 131.065729: dwc3_event: event (080301c6): ep1in: Endpoint Command Complete End transfer completed > irq/16-dwc3-3983 [004] d... 131.065731: dwc3_gadget_giveback: ep1in: req 00000000f7765e56 length 0/16384 zsI ==> -104 for each request is cancelled_list, giveback. > file-storage-3982 [006] .... 131.065766: dwc3_ep_dequeue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115 > file-storage-3982 [006] .... 135.315162: dwc3_ep_dequeue: ep1in: req 00000000eccaa10f length 0/16384 zsI ==> -115 Try to dequeue a request that was already completed. Odd. Why are we missing a call to giveback?? >> What we need is a way to ensure that either END_TRANSFER happens, or >> prove that for cases where END_TRANSFER isn't issued, is because of >> suspend/resume or driver removal. In either case, we can safely giveback >> the TRBs since core will be reinitialized later on. >> >> Still, if you have some tracepoints, I'd like to see why is it that core >> doesn't issue END_TRANSFER. >> > > The driver does send END_TRANSFER command. So, the issue here is the > upperlayer driver can call dequeue() after the END_TRANSFER command was > completed and before any new START_TRANSFER command. The request was not > added to the cancelled list soon enough for it to be cleaned up on > command completion. > > How about giveback the request immediately if endpoint hasn't started > and no pending_end_transfer? we already do that. Here's ep_dequeue with comments: > static int dwc3_gadget_ep_dequeue(struct usb_ep *ep, > struct usb_request *request) > { > struct dwc3_request *req = to_dwc3_request(request); > struct dwc3_request *r = NULL; > > struct dwc3_ep *dep = to_dwc3_ep(ep); > struct dwc3 *dwc = dep->dwc; > > unsigned long flags; > int ret = 0; > > trace_dwc3_ep_dequeue(req); print our trace line > spin_lock_irqsave(&dwc->lock, flags); > > list_for_each_entry(r, &dep->pending_list, list) { > if (r == req) > break; if we find the request on pending_list, meaning it wasn't started... > } > > if (r != req) { ... this won't be true... > list_for_each_entry(r, &dep->started_list, list) { > if (r == req) > break; > } > if (r == req) { > /* wait until it is processed */ > dwc3_stop_active_transfer(dep, true); > > if (!r->trb) > goto out0; > > dwc3_gadget_move_cancelled_request(req); > goto out0; > } > dev_err(dwc->dev, "request %pK was not queued to %s\n", > request, ep->name); > ret = -EINVAL; > goto out0; > } > > dwc3_gadget_giveback(dep, req, -ECONNRESET); ... and we will run this. Request is only moved to cancelled_list if it was previously on started_list. > out0: > spin_unlock_irqrestore(&dwc->lock, flags); > > return ret; > } -- balbi
Attachment:
signature.asc
Description: PGP signature