On Tue, Apr 13, 2021 at 6:13 PM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: > > On Tue, Apr 13, 2021 at 10:12:05AM +0200, Dmitry Vyukov wrote: > > On Tue, Apr 13, 2021 at 10:08 AM syzbot > > <syzbot+eb4674092e6cc8d9e0bd@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote: > > > > > > Hello, > > > > > > syzbot found the following issue on: > > > > > > HEAD commit: 0f4498ce Merge tag 'for-5.12/dm-fixes-2' of git://git.kern.. > > > git tree: upstream > > > console output: https://syzkaller.appspot.com/x/log.txt?x=124adbf6d00000 > > > kernel config: https://syzkaller.appspot.com/x/.config?x=daeff30c2474a60f > > > dashboard link: https://syzkaller.appspot.com/bug?extid=eb4674092e6cc8d9e0bd > > > userspace arch: i386 > > > > > > Unfortunately, I don't have any reproducer for this issue yet. > > > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > > Reported-by: syzbot+eb4674092e6cc8d9e0bd@xxxxxxxxxxxxxxxxxxxxxxxxx > > > > I suspect that the raw gadget_unbind() can be called while the timer > > is still active. gadget_unbind() sets gadget data to NULL. > > But I am not sure which unbind call this is: > > usb_gadget_remove_driver() or right in udc_bind_to_driver() due to a > > start error. > > This certainly looks like a race between gadget_unbind and gadget_setup > in raw_gadget. > > In theory, this race shouldn't matter. The gadget core is supposed to > guarantee that there won't be any more callbacks to the gadget driver > once the driver's unbind routine is called. That guarantee is enforced > in usb_gadget_remove_driver as follows: > > usb_gadget_disconnect(udc->gadget); > if (udc->gadget->irq) > synchronize_irq(udc->gadget->irq); > udc->driver->unbind(udc->gadget); > usb_gadget_udc_stop(udc); > > usb_gadget_disconnect turns off the pullup resistor, telling the host > that the gadget is no longer connected and preventing the transmission > of any more USB packets. Any packets that have already been received > are sure to processed by the UDC driver's interrupt handler by the time > synchronize_irq returns. > > But this doesn't work with dummy_hcd, because dummy_hcd doesn't use > interrupts; it uses a timer instead. It does have code to emulate the > effect of synchronize_irq, but that code doesn't get invoked at the > right time -- it currently runs in usb_gadget_udc_stop, after the unbind > callback instead of before. Indeed, there's no way for > usb_gadget_remove_driver to invoke this code before the unbind > callback,. > > I thought the synchronize_irq emulation problem had been completely > solved, but evidently it hasn't. It looks like the best solution is to > add a call of the synchronize_irq emulation code in dummy_pullup. > > Maybe we can test this reasoning by putting a delay just before the call > to dum->driver->setup. That runs in the timer handler, so it's not a > good place to delay, but it may be okay just for testing purposes. > > Hopefully this patch will make the race a lot more likely to occur. Is > there any way to tell syzkaller to test it, despite the fact that > syzkaller doesn't think it has a reproducer for this issue? If there is no reproducer the only way syzbot can test it is if it's in linux-next under CONFIG_DEBUG_AID_FOR_SYZBOT: http://bit.do/syzbot#no-custom-patches > Alan Stern > > > Index: usb-devel/drivers/usb/gadget/udc/dummy_hcd.c > =================================================================== > --- usb-devel.orig/drivers/usb/gadget/udc/dummy_hcd.c > +++ usb-devel/drivers/usb/gadget/udc/dummy_hcd.c > @@ -1900,6 +1900,7 @@ restart: > if (value > 0) { > ++dum->callback_usage; > spin_unlock(&dum->lock); > + mdelay(5); > value = dum->driver->setup(&dum->gadget, > &setup); > spin_lock(&dum->lock);