On 9/7/2024 6:09 AM, Thinh Nguyen wrote: > On Sat, Sep 07, 2024, Selvarasu Ganesan wrote: >> Hi Thinh, >> >> I ran the code you recommended on our testing environment and was able >> to reproduce the issue one time. >> >> When evt->flags contains DWC3_EVENT_PENDING, I've included the following >> debugging information: I added this debug message at the start of >> dwc3_event_buffers_cleanup and dwc3_event_buffers_setup functions in >> during suspend and resume. >> >> The results were quite interesting . I'm curious to understand how >> evt->flags is set to DWC3_EVENT_PENDING, and along with DWC3_GEVNTSIZ is >> equal to 0x1000 during the suspend. > That is indeed strange. > >> Its means that the previous bottom-half handler prior to suspend might >> still be executing in the middle of the process. >> >> Could you please give your suggestions here? And let me know if anything >> want to test or additional details are required. >> >> >> ##DBG: dwc3_event_buffers_cleanup: >> evt->length :0x1000 >> evt->lpos :0x20c >> evt->count :0x0 >> evt->flags :0x1 // This is Unexpected if DWC3_GEVNTSIZ(0)(0xc408): >> 0x00001000. Its means that previous bottom-half handler may be still >> running in middle > Perhaps. > > But I doubt that's the case since it shouldn't take that long for the > bottom-half to be completed before the next resume yet the flag is still > set. > >> DWC3_GEVNTSIZ(0)(0xc408) : 0x00001000 >> DWC3_GEVNTCOUNT(0)(0xc40c) : 0x00000000 >> DWC3_DCFG(0xc700) : 0x00e008a8 >> DWC3_DCTL(0xc704) : 0x0cf00a00 >> DWC3_DEVTEN(0xc708) : 0x00000000 >> DWC3_DSTS(0xc70c) : 0x00d20cd1 >> > The controller status is halted. So there's no problem with > soft-disconnect. For the interrupt mask in GEVNTSIZ to be cleared, > that likely means that the bottom-half had probably completed. Agree, But I am worrying on If the bottom-half is completed, then DWC3_EVENT_PENDING must be cleared in evt->flags. Is there any possibility of a CPU reordering issue when updating evt->flags in the bottom-half handler?. Should I try with wmb() when writing to evt->flags? > >> ##DBG: dwc3_event_buffers_setup: >> evt->length :0x1000 >> evt->lpos :0x20c > They fact that evt->lpos did not get updated tells me that there's > something wrong with memory access to your platform during suspend and > resume. Are you expecting the evt->lpos value to be zero here? If so, this is expected in our test setup because we avoid writing zero to evt->lpos as part of dwc3_event_buffers_cleanup if evt->flags have a value of 1. This is simply to track the status of evt->lpos during suspend to resume when evt->flags have a value of DWC3_EVENT_PENDING. The following test codes for the reference. --- a/drivers/usb/dwc3/core.c +++ b/drivers/usb/dwc3/core.c @@ -505,8 +505,20 @@ static int dwc3_alloc_event_buffers(struct dwc3 *dwc, unsigned int length) int dwc3_event_buffers_setup(struct dwc3 *dwc) { struct dwc3_event_buffer *evt; + u32 reg; evt = dwc->ev_buf; + + if (evt->flags & DWC3_EVENT_PENDING) { + pr_info("evt->length :%x\n", evt->length); + pr_info("evt->lpos :%x\n", evt->lpos); + pr_info("evt->count :%x\n", evt->count); + pr_info("evt->flags :%x\n", evt->flags); + + dwc3_exynos_reg_dump(dwc); + + } + evt->lpos = 0; dwc3_writel(dwc->regs, DWC3_GEVNTADRLO(0), lower_32_bits(evt->dma)); @@ -514,8 +526,10 @@ int dwc3_event_buffers_setup(struct dwc3 *dwc) upper_32_bits(evt->dma)); dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), DWC3_GEVNTSIZ_SIZE(evt->length)); - dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), 0); + /* Clear any stale event */ + reg = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0)); + dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), reg); return 0; } @@ -525,7 +539,16 @@ void dwc3_event_buffers_cleanup(struct dwc3 *dwc) evt = dwc->ev_buf; - evt->lpos = 0; + if (evt->flags & DWC3_EVENT_PENDING) { + pr_info("evt->length :%x\n", evt->length); + pr_info("evt->lpos :%x\n", evt->lpos); + pr_info("evt->count :%x\n", evt->count); + pr_info("evt->flags :%x\n", evt->flags); + + dwc3_exynos_reg_dump(dwc); + } else { + evt->lpos = 0; + } > >> evt->count :0x0 >> evt->flags :0x1 // Still It's not clearing in during resume. >> >> DWC3_GEVNTSIZ(0)(0xc408) : 0x00000000 >> DWC3_GEVNTCOUNT(0)(0xc40c) : 0x00000000 >> DWC3_DCFG(0xc700) : 0x00080800 >> DWC3_DCTL(0xc704) : 0x00f00000 >> DWC3_DEVTEN(0xc708) : 0x00000000 >> DWC3_DSTS(0xc70c) : 0x00d20001 >> > Please help look into your platform to see what condition triggers this > memory access issue. If this is a hardware quirk, we can properly update > the change and note it to be so. Sure I will try to figure it out. However, we are facing challenges in reproducing the issue. There could be a delay in understanding the conditions that trigger the memory issue if it is related to a memory issue. > > Thanks, > Thinh > > (If possible, for future tests, please dump the dwc3 tracepoints. Many > thanks for the tests.) I tried to get dwc3 traces in the failure case, but so far no instances have been reported. Our testing is still in progress with enable dwc3 traces. I will keep posting once I get the dwc3 traces in the failure scenario. Thanks, Selva