On Fri, Nov 17, 2023, Thinh Nguyen wrote:
> Hi,
>
> On Thu, Nov 16, 2023, Dan Scally wrote:
> > CC Thinh - sorry to bother you, just want to make sure we fix this in the right place.
> >
> > On 08/11/2023 11:48, Kuen-Han Tsai wrote:
> > > On 02/11/2023 07:11, Piyush Mehta wrote:
> > > > There could be chances where the usb_ep_queue() could fail and trigger
> > > > complete() handler with error status. In this case, if usb_ep_queue()
> > > > is called with lock held and the triggered complete() handler is waiting
> > > > for the same lock to be cleared could result in a deadlock situation and
> > > > could result in system hang. To avoid this scenario, call usb_ep_queue()
> > > > with lock removed. This patch does the same.
> > > I would like to provide more background information on this problem.
> > >
> > > We met a deadlock issue on Android devices and the followings are stack traces.
> > >
> > > [35845.978435][T18021] Core - Debugging Information for Hardlockup core(8) - locked CPUs mask (0x100)
> > > [35845.978442][T18021] Call trace:
> > > [*][T18021] queued_spin_lock_slowpath+0x84/0x388
> > > [35845.978451][T18021] uvc_video_complete+0x180/0x24c
> > > [35845.978458][T18021] usb_gadget_giveback_request+0x38/0x14c
> > > [35845.978464][T18021] dwc3_gadget_giveback+0xe4/0x218
> > > [35845.978469][T18021] dwc3_gadget_ep_cleanup_cancelled_requests+0xc8/0x108
> > > [35845.978474][T18021] __dwc3_gadget_kick_transfer+0x34c/0x368
> > > [35845.978479][T18021] __dwc3_gadget_start_isoc+0x13c/0x3b8
> > > [35845.978483][T18021] dwc3_gadget_ep_queue+0x150/0x2f0
> > > [35845.978488][T18021] usb_ep_queue+0x58/0x16c
> > > [35845.978493][T18021] uvcg_video_pump+0x22c/0x518
> >
> >
> > I note in the kerneldoc comment for usb_ep_queue() that calling .complete()
> > from within itself is specifically disallowed [1]:
> >
> > Note that @req's ->complete() callback must never be called from
> >
> > within usb_ep_queue() as that can create deadlock situations.
> >
> >
> > And it looks like that's what's happening here - is this something that
> > needs addressing in the dwc3 driver?
> >
>
> Looks like it. The issue is in dwc3. It should only affect isoc request
> queuing.
>
> Can we try with this patch:
>
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 858fe4c299b7..37e08eed49d9 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -1684,12 +1684,15 @@ static int __dwc3_gadget_kick_transfer(struct dwc3_ep *dep)
>  			dwc3_gadget_move_cancelled_request(req, DWC3_REQUEST_STATUS_DEQUEUED);
>  
>  		/* If ep isn't started, then there's no end transfer pending */
> -		if (!(dep->flags & DWC3_EP_END_TRANSFER_PENDING))
> +		if (!(dep->flags & DWC3_EP_PENDING_REQUEST) &&
> +		    !(dep->flags & DWC3_EP_END_TRANSFER_PENDING))
>  			dwc3_gadget_ep_cleanup_cancelled_requests(dep);
>  
>  		return ret;
>  	}
>  
> +	dep->flags &= ~DWC3_EP_PENDING_REQUEST;
> +
>  	if (dep->stream_capable && req->request.is_last &&
>  	    !DWC3_MST_CAPABLE(&dep->dwc->hwparams))
>  		dep->flags |= DWC3_EP_WAIT_TRANSFER_COMPLETE;
>
> ---
>

Actually, please ignore the above, that's not correct. I'll send out a
proper patch later.

Thanks,
Thinh
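
For reference, the deadlock pattern discussed in this thread can be sketched in user space. This is only an illustration under stated assumptions: try_lock(), fake_complete(), fake_ep_queue() and the two pump_*() helpers are hypothetical stand-ins, not the real UVC or dwc3 functions, and a C11 atomic_flag takes the place of the kernel spinlock so that the "would spin forever" case can be detected instead of hanging.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins; none of these are real gadget APIs. */
static atomic_flag queue_lock = ATOMIC_FLAG_INIT; /* models the lock held by the caller */
static bool would_deadlock;

/* Returns true if the lock was free and is now held by the caller. */
static bool try_lock(void) { return !atomic_flag_test_and_set(&queue_lock); }
static void unlock(void)   { atomic_flag_clear(&queue_lock); }

/* Models a complete() handler that needs the same lock as the caller. */
static void fake_complete(void)
{
	if (!try_lock()) {
		/* A real spin_lock() would spin forever at this point. */
		would_deadlock = true;
		return;
	}
	unlock();
}

/* Models a queue call that fails and gives the request back synchronously. */
static int fake_ep_queue(void)
{
	fake_complete();
	return -1;
}

/* Caller queues while still holding the lock: complete() cannot take it. */
static bool pump_with_lock_held(void)
{
	would_deadlock = false;
	while (!try_lock())
		;
	fake_ep_queue();
	unlock();
	return would_deadlock;
}

/* Caller drops the lock before queuing: complete() can take it. */
static bool pump_with_lock_dropped(void)
{
	would_deadlock = false;
	while (!try_lock())
		;
	/* ... pick the next request under the lock ... */
	unlock();
	fake_ep_queue();
	return would_deadlock;
}
```

pump_with_lock_held() reports the self-deadlock, while pump_with_lock_dropped() does not. The second variant corresponds to the gadget-side workaround of calling usb_ep_queue() without the lock; the kerneldoc note quoted above is why the fix is also being looked at in dwc3.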