Hi John,

John Stultz wrote:
> On Fri, Feb 1, 2019 at 4:18 PM John Stultz <john.stultz@xxxxxxxxxx> wrote:
>> Hey all,
>> Since the 5.0 merge window opened, I've been tripping on frequent
>> dwc3 crashes on reboot and suspend; I've added an example at the
>> bottom of this mail.
>>
>> I've dug in a little bit and sort of have a sense of what's going on.
>>
>> In ffs_epfile_io():
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/gadget/function/f_fs.c#n1065
>>
>> The completion done is set up on the stack:
>> 	DECLARE_COMPLETION_ONSTACK(done);
>>
>> Then later we set up a request and queue it:
>> 	req->context = &done;
>> 	...
>> 	ret = usb_ep_queue(ep->ep, req, GFP_ATOMIC);
>>
>> Then wait for it:
>> 	if (unlikely(wait_for_completion_interruptible(&done))) {
>> 		/*
>> 		 * To avoid race condition with ffs_epfile_io_complete,
>> 		 * dequeue the request first then check
>> 		 * status. usb_ep_dequeue API should guarantee no race
>> 		 * condition with req->complete callback.
>> 		 */
>> 		usb_ep_dequeue(ep->ep, req);
>> 		interrupted = ep->status < 0;
>> 	}
>>
>> The problem is that we end up being interrupted, supposedly dequeue
>> the request, and exit.
>>
>> But then (or in parallel) the irq triggers and we try calling
>> complete() on the context pointer, which now points to random stack
>> space, which results in the panic.
>>
>> It seems like something is wrong with usb_ep_dequeue not really
>> stopping the irq from happening?
>>
>> If I revert all the changes to dwc3 back to 4.20, I don't see the issue.
>>
>> I'll do some bisection to try to narrow things down, but I wanted to
>> see if this was a known issue or if anyone had immediate ideas as to
>> what might be wrong.
>
> Bisecting the changes down, it seems like it's due to commit
> fec9095bdef4e ("usb: dwc3: gadget: remove wait_end_transfer").
>
> It doesn't happen all the time, so I'll need to run some more testing,
> but so far I've not been able to trigger it backing out the patches to
> that point.
>
> thanks
> -john
>

Yeah, it sounds like the same issue. You can review the discussion here:
https://www.spinics.net/lists/linux-usb/msg176110.html

Thinh
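
For anyone following along, the hazard John describes can be modelled in plain user-space C. This is only a rough sketch, not kernel code: fake_completion, fake_request, fake_irq() and the two fake_dequeue_*() helpers are made-up stand-ins, and it assumes (without checking against dwc3) that a dequeue call can return while the controller still owns the request. It compares that case with a dequeue that retires the request before returning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not kernel types or APIs. */
struct fake_completion {
	bool done;
};

struct fake_request {
	struct fake_completion *context; /* mirrors req->context = &done */
	bool owned_by_hw;                /* controller may still complete it */
};

/* Simulated IRQ handler: returns true if it touched req->context. */
static bool fake_irq(struct fake_request *req)
{
	if (!req->owned_by_hw)
		return false;            /* already retired: nothing to do */
	req->owned_by_hw = false;
	req->context->done = true;       /* mirrors complete(req->context) */
	return true;
}

/* Dequeue that only asks the hardware to stop and returns at once. */
static void fake_dequeue_nowait(struct fake_request *req)
{
	(void)req;                       /* request is still owned by hardware */
}

/* Dequeue that retires the request before returning to the caller. */
static void fake_dequeue_wait(struct fake_request *req)
{
	req->owned_by_hw = false;
	req->context = NULL;
}

/*
 * Models an interrupted waiter: queue, dequeue, leave, then a late IRQ.
 * Returns true if the IRQ wrote to the context after the waiter left.
 * The completion is static here so the demo itself has no undefined
 * behaviour; in the report it lives on a stack frame that is gone.
 */
static bool late_irq_touches_context(bool wait_in_dequeue)
{
	static struct fake_completion done;
	struct fake_request req = { .context = &done, .owned_by_hw = true };
	bool touched;

	done.done = false;
	if (wait_in_dequeue)
		fake_dequeue_wait(&req);
	else
		fake_dequeue_nowait(&req);

	/* The waiter would return here; "done" would now be out of scope. */
	touched = fake_irq(&req);
	assert(done.done == touched);
	return touched;
}
```

Calling late_irq_touches_context(false) models the reported crash path, where the late IRQ still writes through the context pointer; late_irq_touches_context(true) models a dequeue that leaves nothing for the IRQ to complete.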