Hi Thinh, On 3/7/2022 3:51 PM, Thinh Nguyen wrote: > Hi Wesley, > > Wesley Cheng wrote: >> Hi Thinh, >> >> On 3/4/2022 4:53 PM, Thinh Nguyen wrote: >>> Wesley Cheng wrote: >>>> Hi Thinh, >>>> >>>> On 2/28/2022 7:02 PM, Thinh Nguyen wrote: >>>>> Wesley Cheng wrote: >>>>>> Hi Thinh, >>>>>> >>>>>> On 2/28/2022 5:09 PM, Thinh Nguyen wrote: >>>>>>> Hi Wesley, >>>>>>> >>>>> >>>>> <snip> >>>>> >>>>>>> >>>>>>> [ 2181.481956865 0x9dc63f265] dbg_complete: ep6in: trb ffffffc01e7f52a0 (E43:D43) buf 00000000ebaf0000 size 1x 0 ctrl 00000810 (hlcs:sC:normal) >>>>>>> [ 2181.482044730 0x9dc63f8fc] dbg_gadget_giveback: ep6in: req ffffff8860657500 length 8/8 zsI ==> 0 >>>>>>> [ 2181.482222490 0x9dc640651] event (0000c040): ep0out: Transfer Complete (sIL) [Setup Phase] >>>>>>> [ 2181.482273271 0x9dc640a20] dbg_trace_log_ctrl: Get Interface Status(Intf = 4, Length = 20) >>>>>>> [ 2181.482334782 0x9dc640ebc] dbg_ep_queue: ep6in: req ffffff8860657500 length 0/8 zsI ==> -115 >>>>>>> [ 2181.482357386 0x9dc64106e] dbg_prepare: ep6in: trb ffffffc01e7f52b0 (E44:D43) buf 00000000ea578000 size 1x 8 ctrl 00000811 (Hlcs:sC:normal) >>>>>>> [ 2181.482391865 0x9dc641304] dbg_send_ep_cmd: ep6in: cmd 'Update Transfer' [d0007] params 00000000 00000000 00000000 --> status: Successful >>>>>>> [ 2181.482485615 0x9dc641a0d] dbg_send_ep_cmd: ep0out: cmd 'Start Transfer' [406] params 00000000 efffa000 00000000 --> status: Successful >>>>>>> [ 2181.482565303 0x9dc642006] event (000010c0): ep0out: Transfer Not Ready [0] (Not Active) [Data Phase] >>>>>>> [ 2181.482719417 0x9dc642b96] event (00002040): ep0out: Transfer Complete (Sil) [Data Phase] >>>>>>> [ 2181.482814938 0x9dc6432c0] dbg_gadget_giveback: ep0out: req ffffff87df84d900 length 20/20 zsI ==> 0 >>>>>>> [ 2181.482926084 0x9dc643b16] event (000020c2): ep0in: Transfer Not Ready [0] (Not Active) [Status Phase] >>>>>>> [ 2181.483024261 0x9dc644272] dbg_send_ep_cmd: ep0in: cmd 'Start Transfer' [406] params 00000000 efffa000 00000000 --> status: Successful >>>>>>> >>>>>>> The control status isn't completed here. >>>>>>> >>>>>>> [ 2181.483069521 0x9dc6445d7] dbg_ep_dequeue: ep2in: req ffffff879f5a8b00 length 0/63680 zsI ==> -115 >>>>>>> [ 2181.496068792 0x9dc6814c9] dbg_send_ep_cmd: ep2in: cmd 'End Transfer' [50d08] params 00000000 00000000 00000000 --> status: Timed Out >>>>>>> >>>>>>> But the dequeue may come when host already sent a new Setup packet. >>>>>>> The ep0out hasn't started yet at the point. >>>>>>> >>>>>>> Due to various system latency, I can see that this can happen when >>>>>>> the dwc3 driver hasn't received the interrupt notified the status stage >>>>>>> event yet. >>>>>>> >>>>>>> If that's the case, the host may have already sent the Setup packet >>>>>>> at this point. So the End Transfer may get stuck if the Setup packet >>>>>>> isn't DMA out yet. >>>>>>> >>>>>>> Can you try the change below to see if it resolves the issue? >>>>>> Thanks, Thinh. Sure I'll give it a try with this change. This is very >>>>>> similar to the change proposed here as well: >>>>>> >>>>>> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20220216000835.25400-3-quic_wcheng@xxxxxxxxxxx/__;!!A4F2R9G_pg!KlgSpNELOXQydIQuarA3A4NJXIcvHslXqzOdBwYqUIR97Mqdp8zdyezhOC9EJ6UqxLxM$ >>>>>> >>>>> >>>>> Not sure if this completely resolves the issue here. The change seems to >>>>> issue the End Transfer command before Start Transfer for the next Setup >>>>> stage completes. Also it's missing some checks for async calls to the >>>>> endpoint that's pending dequeue. Also, we may not need to wait for End >>>>> Transfer command to time out if we know the condition to avoid. >>>>> >>>>>> One thing to mention is that, I'm not sure how dependable checking soley >>>>>> the ep0state would be. I've seen some scenarios where we'd run into the >>>>>> end transfer timeout during the time between inspecting the SETUP packet >>>>>> (dwc3_ep0_inspect_setup()) and when the data phase is queued. The >>>>>> timing of the data phase can potentially differ if it is a vendor >>>>>> specific control request. >>>>> >>>>> This timeout should only apply to Setup packet and Setup stage. Even if >>>>> it's vendor specific control request, it should be fine. Host should not >>>>> issue a Setup packet until it receives a status stage (unless there's a >>>>> disconnect in the middle of a control transfer, but that's a different >>>>> issue). >>>>> >>>>> If you do see a problem. We can take a look further. >>>>> >>>> So far so good w/ the testing. Had to make a small change in your patch >>>> to fix a typo: >>>> if (!(dwc3_ep->flags & DWC3_EP_DELAY_STOP)) >>>> continue; >>>> >>>> dwc3_ep->flags &= ~DWC3_EP_DELAY_STOP; >>>> ret = dwc3_stop_active_transfer(dwc3_ep, true, true); >>>> >>>> Was using dep instead of dwc3_ep. Will let this run over the weekend >>>> and get back to you. >>>> >>> >>> Ok. This seems to confirm my suspicion. Can you update the patch with >>> the following adjustment: >>> >>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c >>> index 3e75eaa13abc..c3f7529f64fc 100644 >>> --- a/drivers/usb/dwc3/gadget.c >>> +++ b/drivers/usb/dwc3/gadget.c >>> @@ -2309,6 +2309,10 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep, >>> if (r == req) { >>> struct dwc3_request *t; >>> >>> + if (dwc->ep0state != EP0_SETUP_PHASE && >>> + !dwc->delayed_status) >>> + dep->flags |= DWC3_EP_DELAY_STOP; >>> + >>> /* wait until it is processed */ >>> dwc3_stop_active_transfer(dep, true, true); >>> >>> This is to avoid a case if the function driver has some dependency for >>> requests to return before sending the control status using delayed >>> status, which can cause a hang. >>> >>> We only need to make sure not to issue End Transfer after the status >>> transfer started and before its completion interrupt, which may prevent >>> the driver from starting the Setup stage. >>> >> >> Added the above change, and tested it over the weekend and it was >> working well for me. However, I wasn't able to really test the >> delayed_status flag too much, since we don't have a function driver that >> utilizes the USB_GADGET_DELAYED_STATUS too much. (we only have a FFS >> interface, which will do it during enum, which is part of the test case >> I ran) >> > > Thanks for the test. The delayed status check is only meant for a > special case if the function driver waits for dequeued requests to > return before sending the control status. No function driver is doing > this at the moment, but I want to put the check here anyway for robustness. > >> Would it also make sense to check for the dwc->setup_packet_pending flag >> as well in the same IF condition? That would mean that there was a >> SETUP packet cached in the controller, which would need to be handled. >> I heard from our CC w/ Synopsys that we need to ensure any pending SETUP >> packets stored in internal memory needed to be cleared as well before >> issuing the endxfer. > > Currently the dwc3 driver doesn't handle setup_packet_pending. It only > uses that flag to handle some quirk. It should be fine when it is > implemented (at some point eventually :)). > > If the EP0_SETUP_PHASE flag is set, that means that the driver had setup > the TRB for the Setup stage, so the Setup packet can be DMA'ed out > within the End Transfer timeout. When the handling of > setup_packet_pending is implemented, it should properly update the ep0state. > > Note: hitting the pending setup packet should be rare. It can happen when > 1) Host aborts the control transfer for some reason and start a new one > 2) The device is disconnected in the middle of the control transfer > > If you plan to implement/handle this scenario, I'll be happy to review > your changes. > > Thanks for the explanation, Thinh. Let me see if I can revisit the setup_packet_pending scenario we ran into in the past, and if its still applicable, I'll submit a separate fix for it. Its been a few years since we've seen that :). >> >> This sounds similar to your statement previously about if the SETUP >> packet wasn't DMA'ed out yet. >> >> Thanks >> Wesley Cheng >> > > Please submit this fix separately from your other RFC patches so it can > go into the driver. > Sounds good. Will take this change outside the RFC series and submit it to go in. Thanks Wesley Cheng -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project