On Tue, Sep 10, 2024, Selvarasu Ganesan wrote: > > On 9/7/2024 6:09 AM, Thinh Nguyen wrote: > > On Sat, Sep 07, 2024, Selvarasu Ganesan wrote: > >> Hi Thinh, > >> > >> I ran the code you recommended on our testing environment and was able > >> to reproduce the issue one time. > >> > >> When evt->flags contains DWC3_EVENT_PENDING, I've included the following > >> debugging information: I added this debug message at the start of > >> dwc3_event_buffers_cleanup and dwc3_event_buffers_setup functions in > >> during suspend and resume. > >> > >> The results were quite interesting . I'm curious to understand how > >> evt->flags is set to DWC3_EVENT_PENDING, and along with DWC3_GEVNTSIZ is > >> equal to 0x1000 during the suspend. > > That is indeed strange. > > > >> Its means that the previous bottom-half handler prior to suspend might > >> still be executing in the middle of the process. > >> > >> Could you please give your suggestions here? And let me know if anything > >> want to test or additional details are required. > >> > >> > >> ##DBG: dwc3_event_buffers_cleanup: > >> evt->length :0x1000 > >> evt->lpos :0x20c > >> evt->count :0x0 > >> evt->flags :0x1 // This is Unexpected if DWC3_GEVNTSIZ(0)(0xc408): > >> 0x00001000. Its means that previous bottom-half handler may be still > >> running in middle > > Perhaps. > > > > But I doubt that's the case since it shouldn't take that long for the > > bottom-half to be completed before the next resume yet the flag is still > > set. > > > >> DWC3_GEVNTSIZ(0)(0xc408) : 0x00001000 > >> DWC3_GEVNTCOUNT(0)(0xc40c) : 0x00000000 > >> DWC3_DCFG(0xc700) : 0x00e008a8 > >> DWC3_DCTL(0xc704) : 0x0cf00a00 > >> DWC3_DEVTEN(0xc708) : 0x00000000 > >> DWC3_DSTS(0xc70c) : 0x00d20cd1 > >> > > The controller status is halted. So there's no problem with > > soft-disconnect. For the interrupt mask in GEVNTSIZ to be cleared, > > that likely means that the bottom-half had probably completed. > > Agree, But I am worrying on If the bottom-half is completed, then > DWC3_EVENT_PENDING must be cleared in evt->flags. > Is there any possibility of a CPU reordering issue when updating > evt->flags in the bottom-half handler?. > Should I try with wmb() when writing to evt->flags? Assuming that the problem occurs after the bottom-half completed, there should be implicit memory barrier. The memory operation should complete before the release from spin_unlock complete. I don't think wmb() will help. > > > >> ##DBG: dwc3_event_buffers_setup: > >> evt->length :0x1000 > >> evt->lpos :0x20c > > They fact that evt->lpos did not get updated tells me that there's > > something wrong with memory access to your platform during suspend and > > resume. > > Are you expecting the evt->lpos value to be zero here? If so, this is > expected in our test setup because we avoid writing zero to evt->lpos as > part of dwc3_event_buffers_cleanup if evt->flags have a value of 1. This Oh ok. I did not know you made this modification. > is simply to track the status of evt->lpos during suspend to resume when > evt->flags have a value of DWC3_EVENT_PENDING. The following test codes > for the reference. > > --- a/drivers/usb/dwc3/core.c > +++ b/drivers/usb/dwc3/core.c > @@ -505,8 +505,20 @@ static int dwc3_alloc_event_buffers(struct dwc3 > *dwc, unsigned int length) > int dwc3_event_buffers_setup(struct dwc3 *dwc) > { > struct dwc3_event_buffer *evt; > + u32 reg; > > evt = dwc->ev_buf; > + > + if (evt->flags & DWC3_EVENT_PENDING) { > + pr_info("evt->length :%x\n", evt->length); > + pr_info("evt->lpos :%x\n", evt->lpos); > + pr_info("evt->count :%x\n", evt->count); > + pr_info("evt->flags :%x\n", evt->flags); > + > + dwc3_exynos_reg_dump(dwc); > + > + } > + > evt->lpos = 0; > dwc3_writel(dwc->regs, DWC3_GEVNTADRLO(0), > lower_32_bits(evt->dma)); > @@ -514,8 +526,10 @@ int dwc3_event_buffers_setup(struct dwc3 *dwc) > upper_32_bits(evt->dma)); > dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), > DWC3_GEVNTSIZ_SIZE(evt->length)); > - dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), 0); > > + /* Clear any stale event */ > + reg = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0)); > + dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), reg); > return 0; > } > > @@ -525,7 +539,16 @@ void dwc3_event_buffers_cleanup(struct dwc3 *dwc) > > evt = dwc->ev_buf; > > - evt->lpos = 0; > + if (evt->flags & DWC3_EVENT_PENDING) { > + pr_info("evt->length :%x\n", evt->length); > + pr_info("evt->lpos :%x\n", evt->lpos); > + pr_info("evt->count :%x\n", evt->count); > + pr_info("evt->flags :%x\n", evt->flags); > + > + dwc3_exynos_reg_dump(dwc); > + } else { > + evt->lpos = 0; I wasn't aware of this change. > + } > > > > >> evt->count :0x0 > >> evt->flags :0x1 // Still It's not clearing in during resume. > >> > >> DWC3_GEVNTSIZ(0)(0xc408) : 0x00000000 > >> DWC3_GEVNTCOUNT(0)(0xc40c) : 0x00000000 > >> DWC3_DCFG(0xc700) : 0x00080800 > >> DWC3_DCTL(0xc704) : 0x00f00000 > >> DWC3_DEVTEN(0xc708) : 0x00000000 > >> DWC3_DSTS(0xc70c) : 0x00d20001 > >> > > Please help look into your platform to see what condition triggers this > > memory access issue. If this is a hardware quirk, we can properly update > > the change and note it to be so. > > Sure I will try to figure it out. However, we are facing challenges in > reproducing the issue. There could be a delay in understanding the > conditions that trigger the memory issue if it is related to a memory issue. > > > > > Thanks, > > Thinh > > > > (If possible, for future tests, please dump the dwc3 tracepoints. Many > > thanks for the tests.) > > I tried to get dwc3 traces in the failure case, but so far no instances > have been reported. Our testing is still in progress with enable dwc3 > traces. > > I will keep posting once I get the dwc3 traces in the failure scenario. > Thanks for the update. I hope enabling of the driver tracepoints will not impact the reproduction of the issue. With the driver log, we'll get more clues to what was going on. Thanks, Thinh