On 8/8/2024 6:45 AM, Thinh Nguyen wrote: > On Wed, Aug 07, 2024, Selvarasu Ganesan wrote: >> On 8/7/2024 6:08 AM, Thinh Nguyen wrote: >>> On Fri, Jul 19, 2024, Selvarasu Ganesan wrote: >>>> In certain scenarios, there is a chance that the CPU may not be >>>> scheduled the bottom half of dwc3 interrupt. This is because the CPU >>>> may hang up where any work queue lockup has happened for the same CPU >>>> that is trying to schedule the dwc3 thread interrupt. In this scenario, >>>> the USB can enter runtime suspend as the bus may idle for a longer time >>>> , or user can reconnect the USB cable. Then, the dwc3 event interrupt >>>> can be enabled when runtime resume is happening with regardless of the >>>> previous event status. This can lead to a dwc3 IRQ storm due to the >>>> return from the interrupt handler by checking only the evt->flags as >>>> DWC3_EVENT_PENDING, where the same flag was set as DWC3_EVENT_PENDING >>>> in previous work queue lockup. >>>> Let's consider the following sequences in this scenario, >>>> >>>> Call trace of dwc3 IRQ after workqueue lockup scenario >>>> ====================================================== >>>> IRQ #1: >>>> ->dwc3_interrupt() >>>> ->dwc3_check_event_buf() >>>> ->if (evt->flags & DWC3_EVENT_PENDING) >>>> return IRQ_HANDLED; >>>> ->evt->flags |= DWC3_EVENT_PENDING; >>>> ->/* Disable interrupt by setting DWC3_GEVNTSIZ_INTMASK in >>>> DWC3_GEVNTSIZ >>>> ->return IRQ_WAKE_THREAD; // No workqueue scheduled for dwc3 >>>> thread_fu due to workqueue lockup >>>> even after return IRQ_WAKE_THREAD >>>> from top-half. >>>> >>>> Thread #2: >>>> ->dwc3_runtime_resume() >>>> ->dwc3_resume_common() >>>> ->dwc3_gadget_resume() >>>> ->dwc3_gadget_soft_connect() >>>> ->dwc3_event_buffers_setup() >>>> ->/*Enable interrupt by clearing DWC3_GEVNTSIZ_INTMASK in >>>> DWC3_GEVNTSIZ*/ >>>> >>>> Start IRQ Storming after enable dwc3 event in resume path >>>> ========================================================= >>>> CPU0: IRQ >>>> dwc3_interrupt() >>>> dwc3_check_event_buf() >>>> if (evt->flags & DWC3_EVENT_PENDING) >>>> return IRQ_HANDLED; >>>> >>>> CPU0: IRQ >>>> dwc3_interrupt() >>>> dwc3_check_event_buf() >>>> if (evt->flags & DWC3_EVENT_PENDING) >>>> return IRQ_HANDLED; >>>> .. >>>> .. >>>> >>>> To fix this issue by avoiding enabling of the dwc3 event interrupt in >>>> the runtime resume path if dwc3 event processing is in progress. >>>> >>>> Signed-off-by: Selvarasu Ganesan <selvarasu.g@xxxxxxxxxxx> >>>> --- >>>> drivers/usb/dwc3/core.c | 8 ++++++-- >>>> 1 file changed, 6 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c >>>> index cb82557678dd..610792a70805 100644 >>>> --- a/drivers/usb/dwc3/core.c >>>> +++ b/drivers/usb/dwc3/core.c >>>> @@ -549,8 +549,12 @@ int dwc3_event_buffers_setup(struct dwc3 *dwc) >>>> lower_32_bits(evt->dma)); >>>> dwc3_writel(dwc->regs, DWC3_GEVNTADRHI(0), >>>> upper_32_bits(evt->dma)); >>>> - dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), >>>> - DWC3_GEVNTSIZ_SIZE(evt->length)); >>>> + >>>> + /* Skip enable dwc3 event interrupt if event is processing in middle */ >>>> + if (!(evt->flags & DWC3_EVENT_PENDING)) >>>> + dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), >>>> + DWC3_GEVNTSIZ_SIZE(evt->length)); >>>> + >>>> dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), 0); >>>> >>>> return 0; >>>> -- >>>> 2.17.1 >>>> >>> We're not waking up from a hibernation. So after a soft-reset and >>> resume, the events that weren't processed are stale. They should be >>> processed prior to entering suspend or be discarded before resume. >>> >>> The synchronize_irq() during suspend() was not sufficient to prevent >>> this? What are we missing here. >>> >>> Thanks, >>> Thinh >> I don’t think the triggering of interrupt would not be stopped even if >> do soft reset. It's because of event count is may be valid . > Ok. I think I see what you're referring to when you say "event is > processing in the middle" now. > > What you want to check is probably this in dwc3_event_buffers_setup(). > Please confirm: > > if (dwc->pending_events) > dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), > DWC3_GEVNTSIZ_INTMASK | DWC3_GEVNTSIZ_SIZE(evt->length)); > else > dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), DWC3_GEVNTSIZ_SIZE(evt->length)); Yes, we are expecting the same. But, we must verify the status of evt->flags, which will indicate whether the event is currently processing in middle or not. The below code is for the reference. if (!(evt->flags & DWC3_EVENT_PENDING)) dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), DWC3_GEVNTSIZ_SIZE(evt->length)); else dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), DWC3_GEVNTSIZ_INTMASK | DWC3_GEVNTSIZ_SIZE(evt->length)); >> Let consider the scenarios where SW is not acknowledge the event count >> after getting a interrupt with disable GEVNTSIZ_INTMASK =0. It will >> triggering the spurious interrupts until enable GEVNTSIZ_INTMASK=1 or >> acknowledge the event count by SW. This is happening here because of We >> just return from interrupt handler by checking if evt->flags as >> DWC3_EVENT_PENDING. Clearing of DWC3_EVENT_PENDING flag is done in >> dwc3_thread_interrupt. In some scenario it's not happening due to cpu >> hangup or work queue lockup. > This can be mitigated by adjusting the imod_interval (interrupt > moderation). Have you tried that? Yes we tried to play around the changing of imod interval value but no improvements. > > Thanks, > Thinh > >> The same issue was reported earlier and not derived actual root cause >> from USB dwc3 driver point of view, and somehow we managing fix with our >> vendor kernel by using kretprobe. >> >> To fix this issue, We have to reset the evt->flags or stop disable >> GEVNTSIZ_INTMASK=0 >> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20230102050831.105499-1-jh0801.jung@xxxxxxxxxxx/__;!!A4F2R9G_pg!b0Uj7NvYzWqrvNKOrkkXaY-g4Uaso-pW520AY2fBdlqmJeudUnmdsgpnL1YxAZaw2ydyg2M-XzhetbCraKxJa5C3NBQ$ >> https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/1646630679-121585-1-git-send-email-jh0801.jung@xxxxxxxxxxx/__;!!A4F2R9G_pg!b0Uj7NvYzWqrvNKOrkkXaY-g4Uaso-pW520AY2fBdlqmJeudUnmdsgpnL1YxAZaw2ydyg2M-XzhetbCraKxJ2d2qDqA$ >> > Thanks, > Thinh