Hi Wesley, On 7/20/2022, Wesley Cheng wrote: > Hi Thinh, > > On 7/14/2022 10:38 AM, Thinh Nguyen wrote: >> On 7/12/2022, Wesley Cheng wrote: >>> Local interrupts are currently being disabled as part of aquiring the >>> spin lock before issuing the endxfer command. Leave interrupts >>> enabled, so >>> that EP0 events can continue to be processed. Also, ensure that >>> there are >>> no pending interrupts before attempting to handle any soft >>> connect/disconnect. >>> >>> Fixes: 861c010a2ee1 ("usb: dwc3: gadget: Refactor pullup()") >>> Signed-off-by: Wesley Cheng <quic_wcheng@xxxxxxxxxxx> >>> --- >>> drivers/usb/dwc3/gadget.c | 21 ++++++++++++--------- >>> 1 file changed, 12 insertions(+), 9 deletions(-) >>> >>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c >>> index a455f8d4631d..ee85b773e3fe 100644 >>> --- a/drivers/usb/dwc3/gadget.c >>> +++ b/drivers/usb/dwc3/gadget.c >>> @@ -1674,6 +1674,7 @@ static int __dwc3_gadget_get_frame(struct dwc3 >>> *dwc) >>> static int __dwc3_stop_active_transfer(struct dwc3_ep *dep, bool >>> force, bool interrupt) >>> { >>> struct dwc3_gadget_ep_cmd_params params; >>> + struct dwc3 *dwc = dep->dwc; >>> u32 cmd; >>> int ret; >>> @@ -1682,7 +1683,9 @@ static int >>> __dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force, bool int >>> cmd |= interrupt ? DWC3_DEPCMD_CMDIOC : 0; >>> cmd |= DWC3_DEPCMD_PARAM(dep->resource_index); >>> memset(¶ms, 0, sizeof(params)); >>> + spin_unlock(&dwc->lock); >>> ret = dwc3_send_gadget_ep_cmd(dep, cmd, ¶ms); >>> + spin_lock(&dwc->lock); >>> WARN_ON_ONCE(ret); >>> dep->resource_index = 0; >>> @@ -2029,12 +2032,11 @@ static int dwc3_gadget_ep_dequeue(struct >>> usb_ep *ep, >>> struct dwc3_ep *dep = to_dwc3_ep(ep); >>> struct dwc3 *dwc = dep->dwc; >>> - unsigned long flags; >>> int ret = 0; >>> trace_dwc3_ep_dequeue(req); >>> - spin_lock_irqsave(&dwc->lock, flags); >>> + spin_lock(&dwc->lock); >>> list_for_each_entry(r, &dep->cancelled_list, list) { >>> if (r == req) >>> @@ -2073,7 +2075,7 @@ static int dwc3_gadget_ep_dequeue(struct >>> usb_ep *ep, >>> request, ep->name); >>> ret = -EINVAL; >>> out: >>> - spin_unlock_irqrestore(&dwc->lock, flags); >>> + spin_unlock(&dwc->lock); >>> return ret; >>> } >>> @@ -2489,9 +2491,7 @@ static int __dwc3_gadget_start(struct dwc3 *dwc); >>> static int dwc3_gadget_soft_disconnect(struct dwc3 *dwc) >>> { >>> - unsigned long flags; >>> - >>> - spin_lock_irqsave(&dwc->lock, flags); >>> + spin_lock(&dwc->lock); >>> dwc->connected = false; >>> /* >>> @@ -2506,10 +2506,10 @@ static int >>> dwc3_gadget_soft_disconnect(struct dwc3 *dwc) >>> reinit_completion(&dwc->ep0_in_setup); >>> - spin_unlock_irqrestore(&dwc->lock, flags); >>> + spin_unlock(&dwc->lock); >>> ret = wait_for_completion_timeout(&dwc->ep0_in_setup, >>> msecs_to_jiffies(DWC3_PULL_UP_TIMEOUT)); >>> - spin_lock_irqsave(&dwc->lock, flags); >>> + spin_lock(&dwc->lock); >>> if (ret == 0) >>> dev_warn(dwc->dev, "timed out waiting for SETUP >>> phase\n"); >>> } >>> @@ -2523,7 +2523,7 @@ static int dwc3_gadget_soft_disconnect(struct >>> dwc3 *dwc) >>> */ >>> dwc3_stop_active_transfers(dwc); >>> __dwc3_gadget_stop(dwc); >>> - spin_unlock_irqrestore(&dwc->lock, flags); >>> + spin_unlock(&dwc->lock); >>> /* >>> * Note: if the GEVNTCOUNT indicates events in the event >>> buffer, the >>> @@ -2569,6 +2569,8 @@ static int dwc3_gadget_pullup(struct >>> usb_gadget *g, int is_on) >>> return 0; >>> } >>> + synchronize_irq(dwc->irq_gadget); >>> + >>> if (!is_on) { >>> ret = dwc3_gadget_soft_disconnect(dwc); >>> } else { >>> @@ -3729,6 +3731,7 @@ void dwc3_stop_active_transfer(struct dwc3_ep >>> *dep, bool force, >>> */ >>> __dwc3_stop_active_transfer(dep, force, interrupt); >>> + >>> } >>> static void dwc3_clear_stall_all_ep(struct dwc3 *dwc) >> >> Hi Greg, >> >> Please don't pick up this patch yet. We're still in discussion with >> this. I have some concern with unlocking/locking when sending End >> Transfer command. For example, this patch may cause issues with >> DWC3_EP_END_TRANSFER_PENDING checks. >> >> Hi Wesley, >> >> Did you try out my suggestion yet? >> > > Just providing a quick update. > > So with your suggestion, I was able to consistently reproduce the > controller halt issue after a day or so of testing. However, when I > took a further look, I believe the problem is due to the DWC3 event > handler: > > static void dwc3_endpoint_interrupt(struct dwc3 *dwc, > const struct dwc3_event_depevt *event) > { > ... > if (!(dep->flags & DWC3_EP_ENABLED)) { > if (!(dep->flags & DWC3_EP_TRANSFER_STARTED)) > return; > > /* Handle only EPCMDCMPLT when EP disabled */ > if (event->endpoint_event != DWC3_DEPEVT_EPCMDCMPLT) > return; > } > > The soft disconnect routine reached to the run/stop polling point, and > I could see that DWC3_EP_DELAYED_STOP was set, and we got a > xfercomplete event for the STATUS phase. However, since we exit early > in the event handler (due to __dwc3_gadget_stop() being called and > disabling EP0), the STATUS complete is never handled, and we do not > issue the endxfer command. > > I don't think I saw this issue with my change, as we allowed the > STATUS phase handling to happen BEFORE gadget stop was called (since I > released the lock in the stop active transfers API). > > However, I think even with my approach, we'd eventually run into a > possibility of this issue, as we aren't truly handling EP0 events > while polling for the halted status due to the above. It was just > reducing the chances. The scenario of this issue is coming because > the host took a long time to complete the STATUS phase, so we ran into > a "timed out waiting for SETUP phase," which allowed us to call the > run/stop routine while we were not yet in the SETUP phase. > > I'm currently running a change to add a EP num check to this IF > condition: > > if ((epnum > 1) && !(dep->flags & DWC3_EP_ENABLED)) { > if (!(dep->flags & DWC3_EP_TRANSFER_STARTED)) > return; > > /* Handle only EPCMDCMPLT when EP disabled */ > if (event->endpoint_event != DWC3_DEPEVT_EPCMDCMPLT) > return; > } > Ah... good catch! Thanks for all the testings. BR, Thinh