On 9/5/2024 5:56 AM, Thinh Nguyen wrote: > On Wed, Sep 04, 2024, Selvarasu Ganesan wrote: >> On 9/4/2024 6:33 AM, Thinh Nguyen wrote: >>> On Mon, Sep 02, 2024, Selvarasu Ganesan wrote: >>>> I would like to reconfirm from our end that in our failure scenario, we >>>> observe that DWC3_EVENT_PENDING is set in evt->flags when the dwc3 >>>> resume sequence is executed, and the dwc->pending_events flag is not >>>> being set. >>>> >>> If the controller is stopped, no event is generated until it's restarted >>> again. (ie, you should not see GEVNTCOUNT updated after clearing >>> DCTL.run_stop). If there's no event, no interrupt assertion should come >>> from the controller. >>> >>> If the pending_events is not set and you still see this failure, then >>> likely that the controller had started, and the interrupt is generated >>> from the controller event. This occurs along with the interrupt >>> generated from your connection notification from your setup. >> >> I completely agree. My discussion revolves around the handling of the >> DWC3_EVENT_PENDING flag in all situations. The purpose of using this >> flag is to prevent the processing of new events if an existing event is >> still being processed. This flag is set in the top-half interrupt >> handler and cleared at the end of the bottom-half handler. >> >> Now, let's consider scenarios where the bottom half is not scheduled, >> and a USB reconnect occurs. In this case, there is a possibility that >> the interrupt line is unmasked in dwc3_event_buffers_setup, and the USB >> controller begins posting new events. The top-half interrupt handler >> checks for the DWC3_EVENT_PENDING flag and returns IRQ_HANDLED without >> processing any new events. However, the USB controller continues to post >> interrupts until they are acknowledged. >> >> Please review the complete sequence once with DWC3_EVENT_PENDING flag. >> >> My proposal is to clear or reset the DWC3_EVENT_PENDING flag when >> unmasking the interrupt line dwc3_event_buffers_setup, apart from >> bottom-half handler. Clearing the DWC3_EVENT_PENDING flag in >> dwc3_event_buffers_setup does not cause any harm, as we have implemented >> a temporary workaround in our test setup to prevent IRQ storms. >> >> >> >> Working scenarios: >> ================== >> 1. Top-half handler: >> a. if (evt->flags & DWC3_EVENT_PENDING) >> return IRQ_HANDLED; >> b. Set DWC3_EVENT_PENDING flag >> c. Masking interrupt line >> >> 2. Bottom-half handler: >> a. Un-masking interrupt line >> b. Clear DWC3_EVENT_PENDING flag >> >> Failure scenarios: >> ================== >> 1. Top-half handler: >> a. if (evt->flags & DWC3_EVENT_PENDING) >> return IRQ_HANDLED; >> b. Set DWC3_EVENT_PENDING flag >> c. Masking interrupt line > For DWC3_EVENT_PENDING flag to be set at this point (before we start the > controller), that means that the GEVNTCOUNT was not 0 after > soft-disconnect and that the pm_runtime_suspended() must be false. In the top-half code where we set the DWC3_EVENT_PENDING flag, we acknowledge GEVNTCOUNT. Therefore, I think it is not necessary for GEVNTCOUNT to have a non-zero value until a new event occurs. In fact, when we tried to print GEVNTCOUNT in a non-interrupt context, we found that it was zero, where we received DWC3_EVENT_PENDING being set in non-interrupt context. > >> 2. No Bottom-half scheduled: > Why is the bottom-half not scheduled? Or do you mean it hasn't woken up > yet before the next top-half coming? In very rare cases, it is possible in our platform that the CPU may not be able to schedule the bottom half of the dwc3 interrupt because a work queue lockup has occurred on the same CPU that is attempting to schedule the dwc3 thread interrupt. In this case Yes, the bottom-half handler hasn't woken up, then initiate an IRQ storm for new events after the controller restarts, resulting in no more bottom-half scheduling due to the CPU being stuck in processing continuous interrupts and return IRQ_HANDLED by checking if (evt->flags & DWC3_EVENT_PENDING). > >> 3. USB reconnect: dwc3_event_buffers_setup >> a. Un-masking interrupt line > Do we know that the GEVNTCOUNT is non-zero before starting the > controller again? The GEVNTCOUNT value showing as zero that we confirmed by adding debug message here. > >> 4. Continuous interrupts : Top-half handler: >> a. if (evt->flags & DWC3_EVENT_PENDING) >> return IRQ_HANDLED; >> >> a. if (evt->flags & DWC3_EVENT_PENDING) >> return IRQ_HANDLED; >> >> a. if (evt->flags & DWC3_EVENT_PENDING) >> return IRQ_HANDLED; >> ..... >> >> ..... >> >> ..... >> Sure, I can try implementing the proposed code modifications in our testing environment. But, I am uncertain about how these changes will effectively prevent an IRQ storm when the USB controller sequence restarts with the DWC3_EVENT_PENDING. The following code will only execute until the DWC3_EVENT_PENDING is cleared, at which point the previous bottom-half will not be scheduled. Please correct me if i am wrong in my above understanding. static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt) { struct dwc3 *dwc = evt->dwc; u32 amount; u32 count; ..... ..... ..... if (evt->flags & DWC3_EVENT_PENDING) return IRQ_HANDLED; } Thanks, Selva > I don't want the driver to process any stale event after a > soft-disconnect. Can we try this instead: > > diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c > index ccc3895dbd7f..a1e6ba92e808 100644 > --- a/drivers/usb/dwc3/core.c > +++ b/drivers/usb/dwc3/core.c > @@ -544,6 +544,7 @@ static int dwc3_alloc_event_buffers(struct dwc3 *dwc, unsigned int length) > int dwc3_event_buffers_setup(struct dwc3 *dwc) > { > struct dwc3_event_buffer *evt; > + u32 reg; > > if (!dwc->ev_buf) > return 0; > @@ -556,7 +557,10 @@ int dwc3_event_buffers_setup(struct dwc3 *dwc) > upper_32_bits(evt->dma)); > dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), > DWC3_GEVNTSIZ_SIZE(evt->length)); > - dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), 0); > + > + /* Clear any stale event */ > + reg = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0)); > + dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), reg); > > return 0; > } > @@ -2484,7 +2488,11 @@ static int dwc3_runtime_resume(struct device *dev) > > switch (dwc->current_dr_role) { > case DWC3_GCTL_PRTCAP_DEVICE: > - dwc3_gadget_process_pending_events(dwc); > + if (dwc->pending_events) { > + pm_runtime_put(dwc->dev); > + dwc->pending_events = false; > + enable_irq(dwc->irq_gadget); > + } > break; > case DWC3_GCTL_PRTCAP_HOST: > default: > @@ -2574,6 +2582,12 @@ static void dwc3_complete(struct device *dev) > static const struct dev_pm_ops dwc3_dev_pm_ops = { > SET_SYSTEM_SLEEP_PM_OPS(dwc3_suspend, dwc3_resume) > .complete = dwc3_complete, > + > + /* > + * Runtime suspend halts the controller on disconnection. It replies on > + * platforms with custom connection notification to start the controller > + * again. > + */ > SET_RUNTIME_PM_OPS(dwc3_runtime_suspend, dwc3_runtime_resume, > dwc3_runtime_idle) > }; > diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h > index 1e561fd8b86e..2fa3fd55eb32 100644 > --- a/drivers/usb/dwc3/core.h > +++ b/drivers/usb/dwc3/core.h > @@ -1673,7 +1673,6 @@ static inline void dwc3_otg_host_init(struct dwc3 *dwc) > #if !IS_ENABLED(CONFIG_USB_DWC3_HOST) > int dwc3_gadget_suspend(struct dwc3 *dwc); > int dwc3_gadget_resume(struct dwc3 *dwc); > -void dwc3_gadget_process_pending_events(struct dwc3 *dwc); > #else > static inline int dwc3_gadget_suspend(struct dwc3 *dwc) > { > @@ -1684,10 +1683,6 @@ static inline int dwc3_gadget_resume(struct dwc3 *dwc) > { > return 0; > } > - > -static inline void dwc3_gadget_process_pending_events(struct dwc3 *dwc) > -{ > -} > #endif /* !IS_ENABLED(CONFIG_USB_DWC3_HOST) */ > > #if IS_ENABLED(CONFIG_USB_DWC3_ULPI) > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c > index 89fc690fdf34..a525f7ea5886 100644 > --- a/drivers/usb/dwc3/gadget.c > +++ b/drivers/usb/dwc3/gadget.c > @@ -4490,6 +4490,17 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt) > return IRQ_HANDLED; > > count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0)); > + > + /* > + * If the controller is halted, the event count is stale/invalid. Ignore > + * them. This happens if the interrupt assertion is from an out-of-band > + * resume notification. > + */ > + if (!dwc->pullups_connected && count) { > + dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), count); > + return IRQ_HANDLED; > + } > + > count &= DWC3_GEVNTCOUNT_MASK; > if (!count) > return IRQ_NONE; > @@ -4735,14 +4746,3 @@ int dwc3_gadget_resume(struct dwc3 *dwc) > > return dwc3_gadget_soft_connect(dwc); > } > - > -void dwc3_gadget_process_pending_events(struct dwc3 *dwc) > -{ > - if (dwc->pending_events) { > - dwc3_interrupt(dwc->irq_gadget, dwc->ev_buf); > - dwc3_thread_interrupt(dwc->irq_gadget, dwc->ev_buf); > - pm_runtime_put(dwc->dev); > - dwc->pending_events = false; > - enable_irq(dwc->irq_gadget); > - } > -} > > > --- > > Thanks, > Thinh