On Tue, Apr 13, 2021 at 10:12:05AM +0200, Dmitry Vyukov wrote: > On Tue, Apr 13, 2021 at 10:08 AM syzbot > <syzbot+eb4674092e6cc8d9e0bd@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote: > > > > Hello, > > > > syzbot found the following issue on: > > > > HEAD commit: 0f4498ce Merge tag 'for-5.12/dm-fixes-2' of git://git.kern.. > > git tree: upstream > > console output: https://syzkaller.appspot.com/x/log.txt?x=124adbf6d00000 > > kernel config: https://syzkaller.appspot.com/x/.config?x=daeff30c2474a60f > > dashboard link: https://syzkaller.appspot.com/bug?extid=eb4674092e6cc8d9e0bd > > userspace arch: i386 > > > > Unfortunately, I don't have any reproducer for this issue yet. > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > Reported-by: syzbot+eb4674092e6cc8d9e0bd@xxxxxxxxxxxxxxxxxxxxxxxxx > > I suspect that the raw gadget_unbind() can be called while the timer > is still active. gadget_unbind() sets gadget data to NULL. > But I am not sure which unbind call this is: > usb_gadget_remove_driver() or right in udc_bind_to_driver() due to a > start error. This certainly looks like a race between gadget_unbind and gadget_setup in raw_gadget. In theory, this race shouldn't matter. The gadget core is supposed to guarantee that there won't be any more callbacks to the gadget driver once the driver's unbind routine is called. That guarantee is enforced in usb_gadget_remove_driver as follows: usb_gadget_disconnect(udc->gadget); if (udc->gadget->irq) synchronize_irq(udc->gadget->irq); udc->driver->unbind(udc->gadget); usb_gadget_udc_stop(udc); usb_gadget_disconnect turns off the pullup resistor, telling the host that the gadget is no longer connected and preventing the transmission of any more USB packets. Any packets that have already been received are sure to processed by the UDC driver's interrupt handler by the time synchronize_irq returns. But this doesn't work with dummy_hcd, because dummy_hcd doesn't use interrupts; it uses a timer instead. It does have code to emulate the effect of synchronize_irq, but that code doesn't get invoked at the right time -- it currently runs in usb_gadget_udc_stop, after the unbind callback instead of before. Indeed, there's no way for usb_gadget_remove_driver to invoke this code before the unbind callback,. I thought the synchronize_irq emulation problem had been completely solved, but evidently it hasn't. It looks like the best solution is to add a call of the synchronize_irq emulation code in dummy_pullup. Maybe we can test this reasoning by putting a delay just before the call to dum->driver->setup. That runs in the timer handler, so it's not a good place to delay, but it may be okay just for testing purposes. Hopefully this patch will make the race a lot more likely to occur. Is there any way to tell syzkaller to test it, despite the fact that syzkaller doesn't think it has a reproducer for this issue? Alan Stern Index: usb-devel/drivers/usb/gadget/udc/dummy_hcd.c =================================================================== --- usb-devel.orig/drivers/usb/gadget/udc/dummy_hcd.c +++ usb-devel/drivers/usb/gadget/udc/dummy_hcd.c @@ -1900,6 +1900,7 @@ restart: if (value > 0) { ++dum->callback_usage; spin_unlock(&dum->lock); + mdelay(5); value = dum->driver->setup(&dum->gadget, &setup); spin_lock(&dum->lock);