Hi Thinh,

On Mon, Oct 17, 2022 at 09:30:38PM +0000, Thinh Nguyen wrote:
> On Mon, Oct 17, 2022, Dan Vacura wrote:
> > From: Jeff Vanhoof <qjv001@xxxxxxxxxxxx>
> >
> > arm-smmu related crashes seen after a Missed ISOC interrupt when
> > no_interrupt=1 is used. This can happen if the hardware is still using
> > the data associated with a TRB after the usb_request's ->complete call
> > has been made. Instead of immediately releasing a request when a Missed
> > ISOC interrupt has occurred, this change will add logic to cancel the
> > request instead where it will eventually be released when the
> > END_TRANSFER command has completed. This logic is similar to some of the
> > cleanup done in dwc3_gadget_ep_dequeue.
>
> This doesn't sound right. How did you determine that the hardware is
> still using the data associated with the TRB? Did you check the TRB's
> HWO bit?

The problem we're seeing was mentioned in the summary of this patch
series, issue #1. Basically, with the following patch
https://patchwork.kernel.org/project/linux-usb/patch/20210628155311.16762-6-m.grzeschik@xxxxxxxxxxxxxx/
integrated a smmu panic is occurring on our Android device with the 5.15
kernel which is:

<3>[ 718.314900][ T803] arm-smmu 15000000.apps-smmu: Unhandled arm-smmu context fault from a600000.dwc3!

The uvc gadget driver appears to be the first (and only) gadget that
uses the no_interrupt=1 logic, so this seems to be a new condition for
the dwc3 driver. In our configuration, we have up to 64 requests and the
no_interrupt=1 for up to 15 requests. The list size of dep->started_list
would get up to that amount when looping through to cleanup the
completed requests. From testing and debugging the smmu panic occurs
when a -EXDEV status shows up and right after
dwc3_gadget_ep_cleanup_completed_request() was visited. The conclusion
we had was the requests were getting returned to the gadget too early.
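To make the batching described above concrete, here is a small standalone
sketch. It is not driver code. It assumes an interrupt is requested on every
16th request, which is one reading of "no_interrupt=1 for up to 15 requests",
and it assumes every outstanding request has completed by the time that
interrupt fires. The names sketch_request, sketch_mark_requests and
sketch_max_cleanup_batch are made up for illustration and do not exist in
the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct usb_request; only the flag discussed here. */
struct sketch_request {
	bool no_interrupt;	/* true: completing this request raises no IRQ */
};

/*
 * Mark requests so that only every 'interval'-th one raises an interrupt.
 * With interval = 16, runs of 15 consecutive requests have no_interrupt set.
 */
static void sketch_mark_requests(struct sketch_request *reqs, size_t n,
				 size_t interval)
{
	assert(interval > 0);
	for (size_t i = 0; i < n; i++)
		reqs[i].no_interrupt = ((i + 1) % interval) != 0;
}

/*
 * Walk the requests in queue order and return the largest number of
 * requests a single cleanup pass has to loop over, i.e. how long the
 * started list gets before an interrupt finally triggers cleanup.
 */
static size_t sketch_max_cleanup_batch(const struct sketch_request *reqs,
				       size_t n)
{
	size_t started = 0;	/* handed to hardware, not yet reclaimed */
	size_t max_batch = 0;

	for (size_t i = 0; i < n; i++) {
		started++;
		if (!reqs[i].no_interrupt) {
			if (started > max_batch)
				max_batch = started;
			started = 0;	/* the cleanup pass empties the list */
		}
	}
	return max_batch;
}
```

Under these assumptions, marking 64 requests with an interval of 16 means each
cleanup pass walks a run of 16 requests, 15 of which completed without their
own interrupt.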
>
> The dwc3 driver would only give back the requests if the TRBs of the
> associated requests are completed or when the device is disconnected.
> If the TRB indicated missed isoc, that means that the TRB is completed
> and its status was updated.

Interesting, the device is not disconnected as we don't get the
-ESHUTDOWN status back and with this patch in place things continue
after a -EXDEV status is received.

>
> There's a special case which dwc3 may give back requests early is the
> case of the device disconnecting. The requests should be returned with
> -ESHUTDOWN, and the gadget driver shouldn't be re-using the requests on
> de-initialization anyway.
>
> We should not issue End Transfer command just because of missed isoc. We
> may want issue End Transfer if the gadget driver is too slow and unable
> to feed requests in time (causing underrun and missed isoc) to resync
> with the host, but we already handle that.

Hmm, isn't that what happens when we get into this condition in
dwc3_gadget_endpoint_trbs_complete():

	if (usb_endpoint_xfer_isoc(dep->endpoint.desc) &&
	    list_empty(&dep->started_list) &&
	    (list_empty(&dep->pending_list) || status == -EXDEV))
		dwc3_stop_active_transfer(dep, true, true);

>
> I'm still not clear what's the problem you're seeing. Do you have the
> crash log? Tracepoints?
>
> BR,
> Thinh

Appreciate the support!

Dan