Re: Disconnect race in Gadget core

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> writes:
>> Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> writes:
>> > On Fri, May 14, 2021 at 10:35:59AM +0300, Felipe Balbi wrote:
>> >> Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> writes:
>> >> > On Wed, May 12, 2021 at 10:00:41AM +0300, Felipe Balbi wrote:
>> >> >> Hmm, IIRC only the storage gadget defers work to another thread.
>> >> >
>> >> > Well, the composite core is built into almost every gadget, and doesn't 
>> >> > it handle Set-Configuration and Set-Interface requests in a separate 
>> >> > thread?  Or doesn't it expect function drivers to do so?
>> >> 
>> >> composite.c doesn't defer anything to a separate thread by itself. The
>> >> gadget driver, if it finds it necessary, should handle it
>> >> internally. Most gadget drivers handle all of this in interrupt context.
>> >> 
>> >> > After all, the gadget first learns about config or altsetting changes 
>> >> > when it receives a Set-Configuration or Set-Interface request, which 
>> >> > happens in interrupt context.  But changing configs or altsettings 
>> >> > requires disabling/enabling endpoints, which needs a process context 
>> >> > (see the kerneldoc for usb_ep_enable and usb_ep_disable).
>> >> 
>> >> Ouch, I don't think any driver is guaranteeing that other than the
>> >> storage gadget.
>> >
>> > A quick search shows that USB_GADGET_DELAYED_STATUS is used in f_fs.c, 
>> > f_tcm.c, and f_uvc.c in addition to f_mass_storage.c.  So the situation 
>> > isn't as bad as you thought, although it should be better.
>> 
>> right, that's not what the documentation means, though. We're still
>> enabling/disabling endpoints in interrupt context, just delaying the
>> status stage until a later time.
>> 
>> It looks like delaying status like this is enough for the current UDC
>> drivers so, perhaps, we should make delayed_status mandatory and update
>> the documentation accordingly.
>
> If it's okay to call those functions in interrupt context then the 
> kerneldoc definitely should be updated.  However, I don't see why you 
> would want to make DELAYED_STATUS mandatory.  If all the necessary work 
> can be done in the set_alt handler, why not return the status 
> immediately?

because we avoid a special case. Instead of having magic return value to
mean "Don't do anything until I enqueue a request" we can just make that
an assumption, i.e. gadget driver *must* enqueue requests for data and
status stages.

> BTW, does it seem odd to you that these function drivers defer some of 
> the set_alt activities into a work thread but do the ep_enable/disable 
> parts right away, in interrupt context?  I should think the drivers 
> would want to minimize the amount of processing that happens 
> in_interrupt.

it is a bit odd, yes. I'm pretty sure this is forcing odd things to
happen in at least camera gadget, which must communicate with v4l2.

IIRC, camera gadget processes frames in the same context as the
->complete callback, as well, which also runs in_interrupt.

>> > Anyway, getting back to the main point...
>> >
>> > To minimize disruption, suppose we add a new callback to usb_gadget_ops:
>> >
>> > 	void	(*udc_async_callbacks)(struct usb_gadget *, int enable);
>> >
>> > The UDC core can call this at the appropriate times, if it is not NULL.  
>> > It allows the core to tell a UDC driver whether or not to issue 
>> > callbacks for setup, disconnect, reset, suspend, and resume.  It doesn't 
>> > affect request completion callbacks.
>> >
>> > So for removing a driver, the proper sequence will be:
>> >
>> > 	usb_gadget_disconnect()
>> > 	if (gadget->ops->udc_async_callbacks)
>> > 		gadget->ops->udc_async_callbacks(gadget, 0);
>> > 	synchronize_irq()
>> > 	udc->driver->unbind()
>> > 	usb_gadget_udc_stop()
>> >
>> > Or maybe the last two steps should be reversed.  In udc_bind_to_driver, 
>
> After some more thought, I decided the last two steps are in the right 
> order now.  When udc_stop runs, it causes the UDC driver to forget about 
> the gadget driver, so there wouldn't be any way to issue the completion 
> callbacks when the unbind handler cancels the outstanding requests and 
> disables the endpoints.
>
>> > the opposite sequence is:
>> >
>> > 	bind
>> > 	udc_start
>
> Then by analogy, these two steps should be reversed.  But it doesn't 
> really matter, because the gadget driver won't try to enable endpoints 
> or do anything else until the host has enumerated the gadget.  (And if 
> there are any gadget devices which don't allow software to control the 
> pull-up separately, then they clearly will want bind to precede 
> udc_start.)

makes sense

>> > 	enable async callbacks
>> > 	connect (turn on pull-up)
>> >
>> > How does this sound?
>> 
>> Sounds reasonable and, probably, minimizes the amount of code that needs
>> to be changed. This will also enable us to fix each UDC in isolation.
>
> Right.  Okay, I'll work on a patch series.

Thank you, Alan.

-- 
balbi

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux