Thank you both, Dan and Alan, for your comments! On Fri, Jul 21, 2023 at 12:32 PM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: > > On Fri, Jul 21, 2023 at 04:57:51PM +0100, Dan Scally wrote: > > Hi Avichal - thanks for all the detail > > > > > A dirty Patch for option 2 is attached below which skips calling > > > usb_function_deactivate if uvc_function_disable was called before. It seems > > > to work okay in testing. Let me know if the analysis and solutions seems okay > > > and I can upload a formal patch. > > > > > > For what it's worth the analysis makes sense; the patch looks ok to me so if > > the conclusion is to fix the problem that way I think it's fine, but I'm > > more inclined to consider this a locking problem in core - it'd be better to > > fix it there I think. > > I'm not so sure that handling this in the core is feasible. Removing > the driver obviously needs to be synchronized with deactivation, since > the two actions affect the same parts of the state (i.e., the pull-ups > and the "connected" flag). I don't have the full context on what caused the locking to be added, but now that it in place, it seems like there needs to be a clarification of expectation between core and the gadget drivers. Is it valid for the gadget drivers to call usb_gadget_deactivate (and similar functions) as a part of disable/unbind (in terms of API/expectations)? 1. If yes, maybe core can track when it is in the middle of resetting and drop calls to usb_gadget_deactivate if called in the middle of the disconnect--->unbind sequence. This is effectively what the patch above does in UVC driver, but core might (with some extra state) have stronger guarantees of when a call is redundant and can be safely dropped. 2. If no, then it becomes the gadget's responsibility to ensure that it doesn't call any of the usb_gadget_* functions when disabling/unbinding. However, it does require core to provide some concrete rules around when things are safe to call, and when they aren't. > > Consequently I don't see how to avoid a deadlock if the driver's unbind > callback does a deactivate. Besides, as the patch mentions, doing so is > never necessary. > > However, even with that call removed we could still have a problem. I > don't know much about how the UVC function driver works, but it would be > reasonable for the driver to have a private mutex that gets held both > during unbind and when the user application closes the V4L2 node. Then > there's an ABBA locking issue: > > Unbind: The UDC core holds connect_lock while calling the UVC > unbind handler, which needs to acquire the private mutex. > > Close node: The UVC driver holds the private mutex while doing > a deactivate, which needs to acquire connect_lock. > > Any ideas on how to clear this up? > I think my question above gives us two options out based on the answer: 1. Core handling redundant calls is the more bullet-proof solution IMO. It means that the gadget driver never holds connect_lock when it shouldn't. No more ABBA! One potential implementation is to track when core is resetting in a protected flag. All functions related to resetting/disconnecting would check the flag before locking connect_lock and would become no-ops if gadget is in the middle of resetting. 2. Some stronger guarantees will let the gadget driver's state machine decide if it can call usb_gadget_* functions. For example, if we can say for sure that disable call will always be followed by the unbind call, and that usb_gadget_* functions are disallowed between the two, then UVC driver can handle ABBA situation by tracking when it is between a disable and unbind call and skip calling usb_gadget_* function until unbind finishes. The downside of (2), is that a poorly written (or malicious) gadget driver can grind the gadget to a halt with a somewhat simple deadlock. Unfortunately, I am travelling over the next week, but I'll try to create and attach a dirty patch for core to handle redundant calls to usb_gadget_* over the next week. I am fairly new and don't know the full semantics around core, so if I am missing something simple here, please do let me know! Regards, Avi