Re: Disconnect race in Gadget core

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 21-05-10 15:38:49, Alan Stern wrote:
> On Mon, May 10, 2021 at 07:43:00PM +0300, Felipe Balbi wrote:
> > 
> > Hi Alan,
> > 
> > Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> writes:
> > > I just noticed this potential race in the Gadget core, specifically, in 
> > > usb_gadget_remove_driver.  Here's the relevant code:
> > >
> > > 	usb_gadget_disconnect(udc->gadget);
> > > 	if (udc->gadget->irq)
> > > 		synchronize_irq(udc->gadget->irq);
> > > 	udc->driver->unbind(udc->gadget);
> > > 	usb_gadget_udc_stop(udc);
> > >
> > > usb_gadget_disconnect will turn off the D+ pull-up, telling the host (or 
> > > upstream hub) that the gadget is no longer connected to the bus.  The 
> > > udc->driver->unbind call then tells the gadget driver that it will no 
> > > longer receive any callbacks from the UDC driver or the gadget core.
> > >
> > > Now suppose that at just this moment, the user unplugs the USB cable.  
> > > The UDC driver will notice that the Vbus voltage has gone away and will 
> > > invoke the gadget driver's ->disconnect callback.  After all, it doesn't 
> > > realize it should avoid making these callbacks until usb_gadget_udc_stop 
> > > has run.
> > >
> > > As a result, the gadget driver's disconnect callback may be invoked 
> > > _after_ the driver has been unbound from the gadget.
> > 
> > argh, nice catch!
> > 
> > > How should we fix this potential problem?
> > 
> > I think we can just move usb_gadget_udc_stop(). Looking at a few udcs,
> > they're just either freeing or masking UDC's interrupts which should
> > prevent the race you describe while also making sure that no further
> > interrupts will trigger.
> > 
> > Perhaps we could move udc_stop() before synchronize_irq(). Do you
> > foresee any issues with that for net2272 or dummy?
> 
> I don't think it will be that easy.  As you may remember, there have 
> been a number of commits over the years fiddling with the order of 
> operations during gadget driver unbinding (although I don't think any of 
> them moved udc_stop from its position at the end).  Still, it's a 
> delicate matter.
> 
> The purpose of synchronize_irq is to wait until any currently running 
> IRQ handlers have finished.  Obviously this doesn't do any good unless 
> the software has arranged beforehand that no new interrupt requests will 
> be generated.  In this case, turning off the pull-up is currently not 
> sufficient.  But waiting until after udc_stop has returned isn't the 
> answer; it wouldn't prevent callbacks from being issued between the 
> unbind and the udc_stop.
> 
> And I'm quite sure that calling udc_stop before unbind would be wrong.  
> The gadget driver would then get various errors (like requests 
> completing with -ESHUTDOWN status) it's not prepared to handle.
> 
> So what we need is a way to tell the UDC driver to stop issuing 
> callbacks.  The ones defined in struct usb_gadget_driver are:
> 
> 	bind and unbind,
> 	bus suspend and bus resume,
> 	setup,
> 	bus reset,
> 	disconnect.
> 
> Bind and unbind aren't relevant for this discussion because they are 
> synchronous, not invoked in response to interrupts.
> 
> In theory there shouldn't be any bus-resume, setup, or bus-reset 
> interrupts once the pull-up is turned off, but it would be good to 
> protect against bad hardware which may produce them.
> 
> Bus suspend is a real possibility.  It occurs when the UDC hasn't 
> received any packets for a few milliseconds, which certainly will be the 
> case when the pull-up is off.  UDC hardware and drivers may or may not 
> be smart enough to realize that under this circumstance, lack of packets 
> shouldn't be reported as a bus suspend.
> 
> And of course, a cable disconnect can occur at any time -- nothing we 
> can do about that.
> 
> Putting it all together, we need to tell the UDC driver, somewhere 
> between turning the pull-up off and doing the synchronize_irq, to stop 
> issuing disconnect (and possibly also setup and suspend) callbacks.  I 
> don't think we can use the pull-up call for this purpose; a gadget 
> driver may want for some reason to disconnect logically from the bus 
> while still knowing whether Vbus is present.  (You may disagree about 
> that, but otherwise what's the point of having a disconnect callback in 
> the first place?)
> 
> Which means in the end that struct usb_gadget_ops needs either to have a 
> new callback for this purpose or to have an existing callback augmented 
> (for example, the ->pullup callback could get an additional argument 
> saying whether to continue issuing callbacks).  Or another possibility 
> is to make UDC drivers call a core routine to do disconnect callbacks 
> instead of issuing those callbacks themselves, and have the core filter 
> out callbacks that come at the wrong time.  Either way, every UDC driver 
> would need to be modified.
> 
> What do you think?  Is this reasoning accurate, or have I missed 
> something?
> 

Hi Alan,

I fixed a similar issue for configfs, see 1a1c851bbd70
("usb: gadget: configfs: fix concurrent issue between composite APIs")
It doesn't prevent disconnect callback, the disconnect callback will check
if unbind has called. The same for .setup and .suspend. Did you see
issues using configfs or legacy gadget? For legacy gadget, just like you said
it is the second disconnect callback is called during the removal process,
the first is called at usb_gadget_disconnect. It is not easy to prevent disconnect
occurring, we could add some logic at composite_disconnect, and let it quit if it is
called the second time.

It is hard to avoid usb_gadget_driver callback until usb_gadget_udc_stop has called,
no matter bad hardware or threaded interrupts, my former solution is avoid
dereferenced pointer issue, most of callbacks handling are useless if the gadget has already
unbind, the only meaningful callback is disconnect, and we have already called it
at usb_gadget_disconnect

-- 

Thanks,
Peter Chen




[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux