On Thu, Jul 07, 2022 at 06:47:09PM +0200, Jozo M. wrote: > But can't be this a deadlock ? > What if IRQ which calls complete(...) will be executed sooner than > wait_for_completion() It's okay. If complete() gets called first then wait_for_completion() will return immediately. Alan Stern > > št 7. 7. 2022 o 18:11 Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> napísal(a): > > > > On Thu, Jul 07, 2022 at 11:43:09AM +0200, Jozo M. wrote: > > > Hello, > > > > > > my kernel running on imx6 was crashing on USB gadgetfs because my > > > kernel was using wait_event API instead of completion (I was convinced > > > it is due to wrong HW setup). > > > During research gadgetfs inode.c function ep_io was not clear for me: > > > > > > we are submiting USB request here > > > value = usb_ep_queue (epdata->ep, req, GFP_ATOMIC); > > > then we are waiting for completion here: > > > value = wait_for_completion_interruptible(&done); > > > but if completion is interrupted we end up here: > > > spin_unlock_irq (&epdata->dev->lock); > > > DBG (epdata->dev, "endpoint gone\n"); > > > epdata->status = -ENODEV; > > > > You left out part of the code. We end up at this code in the case where > > epdata->ep == NULL, and the only way that can happen is if the endpoint > > was removed while we were waiting for the completion. > > > > > At this point ep_io is terminated and stack is not valid. Later on > > > epio_complete might be called from IRQ and it calls complete ((struct > > > completion *)req->context) but stack is no longer valid; > > > Shouldn't we put req->context = NULL; before spin_unlock_irq > > > (&epdata->dev->lock); ? > > > req->context = NULL; > > > spin_unlock_irq (&epdata->dev->lock); > > > DBG (epdata->dev, "endpoint gone\n"); > > > epdata->status = -ENODEV; > > > > You're right that this is a bug, but your solution is not correct. > > There is a race: epio_complete can run at the same time as this code if > > the endpoint is removed concurrently with the interrupt, and your > > approach is still subject to this race. > > > > The right way to fix the problem is to call wait_for_completion(&done) > > between the DBG and the assignment to epdata->status. That way the > > stack will still be valid when epio_complete runs. > > > > Please feel free to submit a patch doing this. > > > > Alan Stern