On Fri, Sep 13, 2024, Thinh Nguyen wrote:
> On Fri, Sep 13, 2024, Selvarasu Ganesan wrote:
> > Hi Thinh,
> >
> > So far, there have been no reported error instances. But we suspect
> > that the issue may be related to our glue driver. In our glue driver, we
> > access the reference of evt->flags when starting or stopping the gadget
> > based on a VBUS notification. We apologize for sharing this information
> > so late, as we only became aware of it recently.
> >
> > The following sequence outlines the possible scenarios of race conditions:
> >
> > Thread#1 (Our glue Driver Sequence)
> > ===================================
> > ->USB VBUS notification
> > ->Start/Stop gadget
> > ->dwc->ev_buf->flags |= BIT(20); (It's for our reference)
> > ->Call dwc3 exynos runtime suspend/resume
> > ->dwc->ev_buf->flags &= ~BIT(20);
> > ->Call dwc3 core runtime suspend/resume
> >
> > Thread#2
> > ========
> > ->dwc3_interrupt()
> > ->evt->flags |= DWC3_EVENT_PENDING;
> > ->dwc3_thread_interrupt()
> > ->evt->flags &= ~DWC3_EVENT_PENDING;
> >
>
> This is great! That's likely the problem. Glad you found it.
>
> >
> > After our internal discussions, we have decided to remove the
> > unnecessary access to evt->flags in our glue driver. We have made these
> > changes and initiated testing.
> >
> > Thank you for your help so far in understanding our glue driver code.
> >
> > We also think that it would be fine to reset evt->flags when the
> > USB controller is started, along with the changes you suggested earlier.
> > This additional measure will help prevent similar issues from occurring
> > in the future.
> >
> > Please let us know your thoughts on this proposal. If it is not
> > necessary, we understand and will proceed accordingly.
> >
>
> You can submit the change I suggested. That's a valid change. However,
> we should not include the reset of the DWC3_EVENT_PENDING flag. Had we
> done this, you might not have found the issue above. It serves no purpose
> for the core driver logic and will be an extra burden for us to maintain.
> (I don't want to scratch my head in the future to figure out why that
> change was needed or worry about whether it can be removed without
> causing a regression).
>

Also, perhaps you may want to revisit and review the change below to see
if the glue driver may be the culprit:

14e497183df2 ("usb: dwc3: core: Prevent USB core invalid event buffer address access")

Thanks,
Thinh
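
For illustration only, the Thread#1/Thread#2 sequence quoted above can be
replayed by hand in plain userspace C. This is a minimal sketch, not driver
code: struct event_buffer, GLUE_MARKER, glue_state and the replay_*()
helpers are hypothetical names, and the value used for DWC3_EVENT_PENDING
is assumed. It relies only on the fact that "flags |= X" is a separate
load, OR and store, so an update from another context can land in between
and then be overwritten.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)              (1u << (n))
#define DWC3_EVENT_PENDING  BIT(0)   /* value assumed for illustration */
#define GLUE_MARKER         BIT(20)  /* hypothetical name for the glue driver's bit */

/* Stand-in for the event buffer; only the shared flags word matters here. */
struct event_buffer {
	uint32_t flags;
};

/*
 * Replay one bad interleaving by hand. The Thread#2 update lands between
 * the load and the store of the glue driver's "flags |= BIT(20)".
 */
uint32_t replay_stale_pending(void)
{
	struct event_buffer evt = { .flags = DWC3_EVENT_PENDING }; /* IRQ already fired */
	uint32_t glue_copy;

	glue_copy = evt.flags;             /* Thread#1: load, PENDING is set */
	evt.flags &= ~DWC3_EVENT_PENDING;  /* Thread#2: threaded handler clears PENDING */
	glue_copy |= GLUE_MARKER;          /* Thread#1: OR on the stale copy */
	evt.flags = glue_copy;             /* Thread#1: store, PENDING is written back */
	evt.flags &= ~GLUE_MARKER;         /* Thread#1: later clears its own bit */

	return evt.flags;                  /* PENDING is left set with nothing to clear it */
}

/* Same sequence with the glue state kept in its own variable: no overlap. */
uint32_t replay_private_field(void)
{
	struct event_buffer evt = { .flags = DWC3_EVENT_PENDING };
	uint32_t glue_state = 0;           /* hypothetical glue-private storage */

	glue_state |= GLUE_MARKER;         /* Thread#1: touches only its own word */
	evt.flags &= ~DWC3_EVENT_PENDING;  /* Thread#2: clears PENDING as usual */
	glue_state &= ~GLUE_MARKER;        /* Thread#1: clears its own bit */
	(void)glue_state;

	return evt.flags;                  /* PENDING stays cleared */
}

/* Sanity checks for the two replays above. */
void check_replays(void)
{
	assert(replay_stale_pending() == DWC3_EVENT_PENDING);
	assert(replay_private_field() == 0);
}
```

Calling check_replays() from a small main() exercises both cases. The
second helper mirrors the decision quoted above to drop the glue driver's
access to evt->flags rather than to reset the flag in the core driver.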