On Mon, Sep 02, 2024, Selvarasu Ganesan wrote: > > On 8/31/2024 6:20 AM, Thinh Nguyen wrote: > > Hi Selvarasu, > > > > On Fri, Aug 30, 2024, Selvarasu Ganesan wrote: > >> On 8/10/2024 5:12 AM, Thinh Nguyen wrote: > >>> On Thu, Aug 08, 2024, Selvarasu Ganesan wrote: > >>>> On 8/8/2024 6:45 AM, Thinh Nguyen wrote: > >>>>> On Wed, Aug 07, 2024, Selvarasu Ganesan wrote: > >>>>>> On 8/7/2024 6:08 AM, Thinh Nguyen wrote: > >>>>>>> On Fri, Jul 19, 2024, Selvarasu Ganesan wrote: > >>>>>>>> In certain scenarios, there is a chance that the CPU may not be > >>>>>>>> scheduled the bottom half of dwc3 interrupt. This is because the CPU > >>>>>>>> may hang up where any work queue lockup has happened for the same CPU > >>>>>>>> that is trying to schedule the dwc3 thread interrupt. In this scenario, > >>>>>>>> the USB can enter runtime suspend as the bus may idle for a longer time > >>>>>>>> , or user can reconnect the USB cable. Then, the dwc3 event interrupt > >>>>>>>> can be enabled when runtime resume is happening with regardless of the > >>>>>>>> previous event status. This can lead to a dwc3 IRQ storm due to the > >>>>>>>> return from the interrupt handler by checking only the evt->flags as > >>>>>>>> DWC3_EVENT_PENDING, where the same flag was set as DWC3_EVENT_PENDING > >>>>>>>> in previous work queue lockup. > >>>>>>>> Let's consider the following sequences in this scenario, > >>>>>>>> > >>>>>>>> Call trace of dwc3 IRQ after workqueue lockup scenario > >>>>>>>> ====================================================== > >>>>>>>> IRQ #1: > >>>>>>>> ->dwc3_interrupt() > >>>>>>>> ->dwc3_check_event_buf() > >>>>>>>> ->if (evt->flags & DWC3_EVENT_PENDING) > >>>>>>>> return IRQ_HANDLED; > >>>>>>>> ->evt->flags |= DWC3_EVENT_PENDING; > >>>>>>>> ->/* Disable interrupt by setting DWC3_GEVNTSIZ_INTMASK in > >>>>>>>> DWC3_GEVNTSIZ > >>>>>>>> ->return IRQ_WAKE_THREAD; // No workqueue scheduled for dwc3 > >>>>>>>> thread_fu due to workqueue lockup > >>>>>>>> even after return IRQ_WAKE_THREAD > >>>>>>>> from top-half. > >>>>>>>> > >>>>>>>> Thread #2: > >>>>>>>> ->dwc3_runtime_resume() > >>>>>>>> ->dwc3_resume_common() > >>>>>>>> ->dwc3_gadget_resume() > >>>>>>>> ->dwc3_gadget_soft_connect() > >>>>>>>> ->dwc3_event_buffers_setup() > >>>>>>>> ->/*Enable interrupt by clearing DWC3_GEVNTSIZ_INTMASK in > >>>>>>>> DWC3_GEVNTSIZ*/ > >>>>>>>> > >>>>>>>> Start IRQ Storming after enable dwc3 event in resume path > >>>>>>>> ========================================================= > >>>>>>>> CPU0: IRQ > >>>>>>>> dwc3_interrupt() > >>>>>>>> dwc3_check_event_buf() > >>>>>>>> if (evt->flags & DWC3_EVENT_PENDING) > >>>>>>>> return IRQ_HANDLED; > >>>>>>>> > >>>>>>>> CPU0: IRQ > >>>>>>>> dwc3_interrupt() > >>>>>>>> dwc3_check_event_buf() > >>>>>>>> if (evt->flags & DWC3_EVENT_PENDING) > >>>>>>>> return IRQ_HANDLED; > >>>>>>>> .. > >>>>>>>> .. > >>>>>>>> > >>>>>>>> To fix this issue by avoiding enabling of the dwc3 event interrupt in > >>>>>>>> the runtime resume path if dwc3 event processing is in progress. > >>>>>>>> > >>>>>>>> Signed-off-by: Selvarasu Ganesan <selvarasu.g@xxxxxxxxxxx> > >>>>>>>> --- > >>>>>>>> drivers/usb/dwc3/core.c | 8 ++++++-- > >>>>>>>> 1 file changed, 6 insertions(+), 2 deletions(-) > >>>>>>>> > >>>>>>>> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c > >>>>>>>> index cb82557678dd..610792a70805 100644 > >>>>>>>> --- a/drivers/usb/dwc3/core.c > >>>>>>>> +++ b/drivers/usb/dwc3/core.c > >>>>>>>> @@ -549,8 +549,12 @@ int dwc3_event_buffers_setup(struct dwc3 *dwc) > >>>>>>>> lower_32_bits(evt->dma)); > >>>>>>>> dwc3_writel(dwc->regs, DWC3_GEVNTADRHI(0), > >>>>>>>> upper_32_bits(evt->dma)); > >>>>>>>> - dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), > >>>>>>>> - DWC3_GEVNTSIZ_SIZE(evt->length)); > >>>>>>>> + > >>>>>>>> + /* Skip enable dwc3 event interrupt if event is processing in middle */ > >>>>>>>> + if (!(evt->flags & DWC3_EVENT_PENDING)) > >>>>>>>> + dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), > >>>>>>>> + DWC3_GEVNTSIZ_SIZE(evt->length)); > >>>>>>>> + > >>>>>>>> dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), 0); > >>>>>>>> > >>>>>>>> return 0; > >>>>>>>> -- > >>>>>>>> 2.17.1 > >>>>>>>> > >>>>>>> We're not waking up from a hibernation. So after a soft-reset and > >>>>>>> resume, the events that weren't processed are stale. They should be > >>>>>>> processed prior to entering suspend or be discarded before resume. > >>>>>>> > >>>>>>> The synchronize_irq() during suspend() was not sufficient to prevent > >>>>>>> this? What are we missing here. > >>>>>>> > >>>>>>> Thanks, > >>>>>>> Thinh > >>>>>> I don’t think the triggering of interrupt would not be stopped even if > >>>>>> do soft reset. It's because of event count is may be valid . > >>>>> Ok. I think I see what you're referring to when you say "event is > >>>>> processing in the middle" now. > >>>>> > >>>>> What you want to check is probably this in dwc3_event_buffers_setup(). > >>>>> Please confirm: > >>>>> > >>>>> if (dwc->pending_events) > >>>>> dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), > >>>>> DWC3_GEVNTSIZ_INTMASK | DWC3_GEVNTSIZ_SIZE(evt->length)); > >>>>> else > >>>>> dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), DWC3_GEVNTSIZ_SIZE(evt->length)); > >>>> Yes, we are expecting the same. But, we must verify the status of > >>>> evt->flags, which will indicate whether the event is currently > >>>> processing in middle or not. The below code is for the reference. > >>>> > >>>> if (!(evt->flags & DWC3_EVENT_PENDING)) > >>>> dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), > >>>> DWC3_GEVNTSIZ_SIZE(evt->length)); > >>>> else > >>>> dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), > >>>> DWC3_GEVNTSIZ_INTMASK | DWC3_GEVNTSIZ_SIZE(evt->length)); > >>> So, this happens while pending_events is set right? I need to review > >>> this runtime suspend flow next week. Something doesn't look right. When > >>> there's a suspend/resume runtime or not, there's a soft disconnect. We > >>> shouldn't be processing any event prior to going into suspend. Also, we > >>> shouldn't be doing soft-disconnect while connected and in operation > >>> unless we specifically tell it to. > >> HI Thinh, > >> > >> Would you be able to review this runtime suspend flow? > >> > >> From our end, after conducting multiple regression tests, we have > >> determined that the resetting of "evt->flags" are sufficient when > >> attempting to enable event IRQ masks instead of enable event IRQ mask > >> based on pending event flags. There is a possibility that reconnecting > >> USB with the host PC may cause event interrupts to be missed by the CPU > >> if disable event IRQ mask. So, The fix should be as follow. Could you > >> please review this once from your end? > >> > >> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c > >> index ccc3895dbd7f..3b2441608e9e 100644 > >> --- a/drivers/usb/dwc3/core.c > >> +++ b/drivers/usb/dwc3/core.c > >> @@ -554,6 +554,15 @@ int dwc3_event_buffers_setup(struct dwc3 *dwc) > >> lower_32_bits(evt->dma)); > >> dwc3_writel(dwc->regs, DWC3_GEVNTADRHI(0), > >> upper_32_bits(evt->dma)); > >> + > >> + /* > >> + * The DWC3_EVENT_PENDING flag is cleared if it has > >> + * already been set when enabling the event IRQ mask > >> + * to prevent possibly of an IRQ storm. > >> + */ > >> + if (evt->flags & DWC3_EVENT_PENDING) > >> + evt->flags &= ~DWC3_EVENT_PENDING; > >> + > >> dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), > >> DWC3_GEVNTSIZ_SIZE(evt->length)); > >> dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), 0); > >> > > Sorry for the delay response. > > > > In addition to that, please rework and rename > > dwc3_gadget_process_pending_event(). We're not supposed to handle any > > left-over event. So remove the manual calls to the interrupt handlers > > there. > Hi Thinh, > > Thanks for your comments. > > Regarding the handling of leftover events during dwc3 runtime resume, I > understand that we are not supposed to handle any leftover events. Would Actually, there's no leftover event. > you be interested in making changes to the code as suggested below? The > reason for removing interrupt handlers from > dwc3_gadget_process_pending_events() is to avoid handling any leftover > events from the last suspend right. If so, based on my understanding, we > can simply remove the use of dwc3_gadget_process_pending_events() > instead of rework on this function since it is not necessary if we > remove the call to interrupt handlers from there. That's right. > > I would like to reconfirm from our end that in our failure scenario, we > observe that DWC3_EVENT_PENDING is set in evt->flags when the dwc3 > resume sequence is executed, and the dwc->pending_events flag is not > being set. > > > > diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c > index ccc3895dbd7f..63e8dd24ad0e 100644 > --- a/drivers/usb/dwc3/core.c > +++ b/drivers/usb/dwc3/core.c > @@ -550,6 +550,15 @@ int dwc3_event_buffers_setup(struct dwc3 *dwc) > > evt = dwc->ev_buf; > evt->lpos = 0; > + > + /* > + * The DWC3_EVENT_PENDING flag is cleared if it has > + * already been set when enabling the event IRQ mask > + * to prevent possibly of an IRQ storm. > + */ > + if (evt->flags & DWC3_EVENT_PENDING) > + evt->flags &= ~DWC3_EVENT_PENDING; > + We actually don't need to clear DWCC3_EVENT_PENDING flag if we remove the below. > dwc3_writel(dwc->regs, DWC3_GEVNTADRLO(0), > lower_32_bits(evt->dma)); > dwc3_writel(dwc->regs, DWC3_GEVNTADRHI(0), > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c > index 89fc690fdf34..951c805337c2 100644 > --- a/drivers/usb/dwc3/gadget.c > +++ b/drivers/usb/dwc3/gadget.c > @@ -4739,8 +4739,6 @@ int dwc3_gadget_resume(struct dwc3 *dwc) > void dwc3_gadget_process_pending_events(struct dwc3 *dwc) > { > if (dwc->pending_events) { > - dwc3_interrupt(dwc->irq_gadget, dwc->ev_buf); > - dwc3_thread_interrupt(dwc->irq_gadget, dwc->ev_buf); > pm_runtime_put(dwc->dev); > dwc->pending_events = false; > enable_irq(dwc->irq_gadget); > > > Thanks, > Selva > > > > > On runtime suspend, the device is soft disconnected. So any interrupt > > assertion to notify a new connection must be a custom configuration of > > your platform. No event should be generated while the run_stop bit is > > cleared. > > > > On runtime resume, we will initiate soft-reset and soft-connect to > > restore the run_stop bit. A new connection event will be generated then. > > Agree. > Please try the following with some cleanup and additional comments: diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c index ccc3895dbd7f..f02dae464efe 100644 --- a/drivers/usb/dwc3/core.c +++ b/drivers/usb/dwc3/core.c @@ -2484,7 +2484,11 @@ static int dwc3_runtime_resume(struct device *dev) switch (dwc->current_dr_role) { case DWC3_GCTL_PRTCAP_DEVICE: - dwc3_gadget_process_pending_events(dwc); + if (dwc->pending_events) { + pm_runtime_put(dwc->dev); + dwc->pending_events = false; + enable_irq(dwc->irq_gadget); + } break; case DWC3_GCTL_PRTCAP_HOST: default: @@ -2574,6 +2578,14 @@ static void dwc3_complete(struct device *dev) static const struct dev_pm_ops dwc3_dev_pm_ops = { SET_SYSTEM_SLEEP_PM_OPS(dwc3_suspend, dwc3_resume) .complete = dwc3_complete, + + /* + * Current implementation for runtime suspend stops the controller on + * disconnection. It relies on platforms with custom connection + * interrupt to notify the driver to start the controller again. This + * flow is platform specific implemenation and not part of the + * controller's programming flow. + */ SET_RUNTIME_PM_OPS(dwc3_runtime_suspend, dwc3_runtime_resume, dwc3_runtime_idle) }; diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h index 1e561fd8b86e..2fa3fd55eb32 100644 --- a/drivers/usb/dwc3/core.h +++ b/drivers/usb/dwc3/core.h @@ -1673,7 +1673,6 @@ static inline void dwc3_otg_host_init(struct dwc3 *dwc) #if !IS_ENABLED(CONFIG_USB_DWC3_HOST) int dwc3_gadget_suspend(struct dwc3 *dwc); int dwc3_gadget_resume(struct dwc3 *dwc); -void dwc3_gadget_process_pending_events(struct dwc3 *dwc); #else static inline int dwc3_gadget_suspend(struct dwc3 *dwc) { @@ -1684,10 +1683,6 @@ static inline int dwc3_gadget_resume(struct dwc3 *dwc) { return 0; } - -static inline void dwc3_gadget_process_pending_events(struct dwc3 *dwc) -{ -} #endif /* !IS_ENABLED(CONFIG_USB_DWC3_HOST) */ #if IS_ENABLED(CONFIG_USB_DWC3_ULPI) diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c index 89fc690fdf34..ea97e9c1d1bd 100644 --- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -4735,14 +4735,3 @@ int dwc3_gadget_resume(struct dwc3 *dwc) return dwc3_gadget_soft_connect(dwc); } - -void dwc3_gadget_process_pending_events(struct dwc3 *dwc) -{ - if (dwc->pending_events) { - dwc3_interrupt(dwc->irq_gadget, dwc->ev_buf); - dwc3_thread_interrupt(dwc->irq_gadget, dwc->ev_buf); - pm_runtime_put(dwc->dev); - dwc->pending_events = false; - enable_irq(dwc->irq_gadget); - } -} -- BR, Thinh