On Fri, Jul 21, 2023 at 03:44:52PM -0700, Avichal Rakesh wrote: > Thank you both, Dan and Alan, for your comments! > > On Fri, Jul 21, 2023 at 12:32 PM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: > > > > On Fri, Jul 21, 2023 at 04:57:51PM +0100, Dan Scally wrote: > > > Hi Avichal - thanks for all the detail > > > > > > > A dirty Patch for option 2 is attached below which skips calling > > > > usb_function_deactivate if uvc_function_disable was called before. It seems > > > > to work okay in testing. Let me know if the analysis and solutions seems okay > > > > and I can upload a formal patch. > > > > > > > > > For what it's worth the analysis makes sense; the patch looks ok to me so if > > > the conclusion is to fix the problem that way I think it's fine, but I'm > > > more inclined to consider this a locking problem in core - it'd be better to > > > fix it there I think. > > > > I'm not so sure that handling this in the core is feasible. Removing > > the driver obviously needs to be synchronized with deactivation, since > > the two actions affect the same parts of the state (i.e., the pull-ups > > and the "connected" flag). > > I don't have the full context on what caused the locking to be added, > but now that it > in place, it seems like there needs to be a clarification of > expectation between core > and the gadget drivers. Is it valid for the gadget drivers to call > usb_gadget_deactivate (and similar functions) as a part of disable/unbind > (in terms of API/expectations)? > > 1. If yes, maybe core can track when it is in the middle of resetting and > drop calls to usb_gadget_deactivate if called in the middle of the > disconnect--->unbind sequence. This is effectively what the patch above > does in UVC driver, but core might (with some extra state) have stronger > guarantees of when a call is redundant and can be safely dropped. > > 2. If no, then it becomes the gadget's responsibility to ensure that it doesn't > call any of the usb_gadget_* functions when disabling/unbinding. However, it > does require core to provide some concrete rules around when things are safe > to call, and when they aren't. > > > > > Consequently I don't see how to avoid a deadlock if the driver's unbind > > callback does a deactivate. Besides, as the patch mentions, doing so is > > never necessary. > > > > However, even with that call removed we could still have a problem. I > > don't know much about how the UVC function driver works, but it would be > > reasonable for the driver to have a private mutex that gets held both > > during unbind and when the user application closes the V4L2 node. Then > > there's an ABBA locking issue: > > > > Unbind: The UDC core holds connect_lock while calling the UVC > > unbind handler, which needs to acquire the private mutex. > > > > Close node: The UVC driver holds the private mutex while doing > > a deactivate, which needs to acquire connect_lock. > > > > Any ideas on how to clear this up? > > > > I think my question above gives us two options out based on the answer: > > 1. Core handling redundant calls is the more bullet-proof solution IMO. It > means that the gadget driver never holds connect_lock when it shouldn't. > No more ABBA! > > One potential implementation is to track when core is resetting in a protected > flag. All functions related to resetting/disconnecting would check the > flag before > locking connect_lock and would become no-ops if gadget is in the middle of > resetting. > > 2. Some stronger guarantees will let the gadget driver's state machine decide > if it can call usb_gadget_* functions. For example, if we can say for sure that > disable call will always be followed by the unbind call, and that usb_gadget_* > functions are disallowed between the two, then UVC driver can handle ABBA > situation by tracking when it is between a disable and unbind call and skip > calling usb_gadget_* function until unbind finishes. > > The downside of (2), is that a poorly written (or malicious) gadget driver can > grind the gadget to a halt with a somewhat simple deadlock. > > Unfortunately, I am travelling over the next week, but I'll try to > create and attach > a dirty patch for core to handle redundant calls to usb_gadget_* over the next > week. > > I am fairly new and don't know the full semantics around core, so if I > am missing > something simple here, please do let me know! Here's a proposal, along the lines of your first suggestion above. The idea is to avoid holding the connect_lock mutex while invoking a gadget driver's callbacks. Unfortunately, this is unavoidable in the case of the ->disconnect() callback, but that's okay because the kerneldoc already says that ->disconnect() may be called in_interrupt, so it's not allowed to call any core routines that may sleep. Just to make this perfectly clear, the patch adds a couple of comments along these lines. As far as I can tell, there is no real reason to hold connect_lock during the ->unbind() callback. Not doing so should fix the problem in the UVC function driver. Let me know if this works. Alan Stern drivers/usb/gadget/udc/core.c | 9 +++++++++ 1 file changed, 9 insertions(+) Index: usb-devel/drivers/usb/gadget/udc/core.c =================================================================== --- usb-devel.orig/drivers/usb/gadget/udc/core.c +++ usb-devel/drivers/usb/gadget/udc/core.c @@ -822,6 +822,9 @@ EXPORT_SYMBOL_GPL(usb_gadget_disconnect) * usb_gadget_activate() is called. For example, user mode components may * need to be activated before the system can talk to hosts. * + * This routine may sleep; it must not be called in interrupt context + * (such as from within a gadget driver's disconnect() callback). + * * Returns zero on success, else negative errno. */ int usb_gadget_deactivate(struct usb_gadget *gadget) @@ -860,6 +863,8 @@ EXPORT_SYMBOL_GPL(usb_gadget_deactivate) * This routine activates gadget which was previously deactivated with * usb_gadget_deactivate() call. It calls usb_gadget_connect() if needed. * + * This routine may sleep; it must not be called in interrupt context. + * * Returns zero on success, else negative errno. */ int usb_gadget_activate(struct usb_gadget *gadget) @@ -1639,7 +1644,11 @@ static void gadget_unbind_driver(struct usb_gadget_disable_async_callbacks(udc); if (gadget->irq) synchronize_irq(gadget->irq); + mutex_unlock(&udc->connect_lock); + udc->driver->unbind(gadget); + + mutex_lock(&udc->connect_lock); usb_gadget_udc_stop_locked(udc); mutex_unlock(&udc->connect_lock);