Wesley Cheng wrote: > Hi Thinh, > > On 2/28/2022 7:02 PM, Thinh Nguyen wrote: >> Wesley Cheng wrote: >>> Hi Thinh, >>> >>> On 2/28/2022 5:09 PM, Thinh Nguyen wrote: >>>> Hi Wesley, >>>> >> >> <snip> >> >>>> >>>> [ 2181.481956865 0x9dc63f265] dbg_complete: ep6in: trb ffffffc01e7f52a0 (E43:D43) buf 00000000ebaf0000 size 1x 0 ctrl 00000810 (hlcs:sC:normal) >>>> [ 2181.482044730 0x9dc63f8fc] dbg_gadget_giveback: ep6in: req ffffff8860657500 length 8/8 zsI ==> 0 >>>> [ 2181.482222490 0x9dc640651] event (0000c040): ep0out: Transfer Complete (sIL) [Setup Phase] >>>> [ 2181.482273271 0x9dc640a20] dbg_trace_log_ctrl: Get Interface Status(Intf = 4, Length = 20) >>>> [ 2181.482334782 0x9dc640ebc] dbg_ep_queue: ep6in: req ffffff8860657500 length 0/8 zsI ==> -115 >>>> [ 2181.482357386 0x9dc64106e] dbg_prepare: ep6in: trb ffffffc01e7f52b0 (E44:D43) buf 00000000ea578000 size 1x 8 ctrl 00000811 (Hlcs:sC:normal) >>>> [ 2181.482391865 0x9dc641304] dbg_send_ep_cmd: ep6in: cmd 'Update Transfer' [d0007] params 00000000 00000000 00000000 --> status: Successful >>>> [ 2181.482485615 0x9dc641a0d] dbg_send_ep_cmd: ep0out: cmd 'Start Transfer' [406] params 00000000 efffa000 00000000 --> status: Successful >>>> [ 2181.482565303 0x9dc642006] event (000010c0): ep0out: Transfer Not Ready [0] (Not Active) [Data Phase] >>>> [ 2181.482719417 0x9dc642b96] event (00002040): ep0out: Transfer Complete (Sil) [Data Phase] >>>> [ 2181.482814938 0x9dc6432c0] dbg_gadget_giveback: ep0out: req ffffff87df84d900 length 20/20 zsI ==> 0 >>>> [ 2181.482926084 0x9dc643b16] event (000020c2): ep0in: Transfer Not Ready [0] (Not Active) [Status Phase] >>>> [ 2181.483024261 0x9dc644272] dbg_send_ep_cmd: ep0in: cmd 'Start Transfer' [406] params 00000000 efffa000 00000000 --> status: Successful >>>> >>>> The control status isn't completed here. >>>> >>>> [ 2181.483069521 0x9dc6445d7] dbg_ep_dequeue: ep2in: req ffffff879f5a8b00 length 0/63680 zsI ==> -115 >>>> [ 2181.496068792 0x9dc6814c9] dbg_send_ep_cmd: ep2in: cmd 'End Transfer' [50d08] params 00000000 00000000 00000000 --> status: Timed Out >>>> >>>> But the dequeue may come when host already sent a new Setup packet. >>>> The ep0out hasn't started yet at the point. >>>> >>>> Due to various system latency, I can see that this can happen when >>>> the dwc3 driver hasn't received the interrupt notified the status stage >>>> event yet. >>>> >>>> If that's the case, the host may have already sent the Setup packet >>>> at this point. So the End Transfer may get stuck if the Setup packet >>>> isn't DMA out yet. >>>> >>>> Can you try the change below to see if it resolves the issue? >>> Thanks, Thinh. Sure I'll give it a try with this change. This is very >>> similar to the change proposed here as well: >>> >>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20220216000835.25400-3-quic_wcheng@xxxxxxxxxxx/__;!!A4F2R9G_pg!KlgSpNELOXQydIQuarA3A4NJXIcvHslXqzOdBwYqUIR97Mqdp8zdyezhOC9EJ6UqxLxM$ >>> >> >> Not sure if this completely resolves the issue here. The change seems to >> issue the End Transfer command before Start Transfer for the next Setup >> stage completes. Also it's missing some checks for async calls to the >> endpoint that's pending dequeue. Also, we may not need to wait for End >> Transfer command to time out if we know the condition to avoid. >> >>> One thing to mention is that, I'm not sure how dependable checking soley >>> the ep0state would be. I've seen some scenarios where we'd run into the >>> end transfer timeout during the time between inspecting the SETUP packet >>> (dwc3_ep0_inspect_setup()) and when the data phase is queued. The >>> timing of the data phase can potentially differ if it is a vendor >>> specific control request. >> >> This timeout should only apply to Setup packet and Setup stage. Even if >> it's vendor specific control request, it should be fine. Host should not >> issue a Setup packet until it receives a status stage (unless there's a >> disconnect in the middle of a control transfer, but that's a different >> issue). >> >> If you do see a problem. We can take a look further. >> > So far so good w/ the testing. Had to make a small change in your patch > to fix a typo: > if (!(dwc3_ep->flags & DWC3_EP_DELAY_STOP)) > continue; > > dwc3_ep->flags &= ~DWC3_EP_DELAY_STOP; > ret = dwc3_stop_active_transfer(dwc3_ep, true, true); > > Was using dep instead of dwc3_ep. Will let this run over the weekend > and get back to you. > Ok. This seems to confirm my suspicion. Can you update the patch with the following adjustment: diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c index 3e75eaa13abc..c3f7529f64fc 100644 --- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -2309,6 +2309,10 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep, if (r == req) { struct dwc3_request *t; + if (dwc->ep0state != EP0_SETUP_PHASE && + !dwc->delayed_status) + dep->flags |= DWC3_EP_DELAY_STOP; + /* wait until it is processed */ dwc3_stop_active_transfer(dep, true, true); This is to avoid a case if the function driver has some dependency for requests to return before sending the control status using delayed status, which can cause a hang. We only need to make sure not to issue End Transfer after the status transfer started and before its completion interrupt, which may prevent the driver from starting the Setup stage. Thanks, Thinh