Hi Avichal - thanks for all the detail
On 20/07/2023 02:28, Avichal Rakesh wrote:
Hey all,
I recently ran into a kernel panic when testing the UVC Gadget Driver.
The device ramdumps with the following stack when removing the UVC config from
configfs:
KP: Oops - BUG: Fatal exception: comm:Thread-6 PC:__list_del_entry_valid+0xb0/0xc4 LR:__list_del_entry_valid+0xb0/0xc4
PC: __list_del_entry_valid+0xb0 <FFFFFFE685330294>
LR: __list_del_entry_valid+0xb0 <FFFFFFE685330294>
[<FFFFFFE685330294>] __list_del_entry_valid+0xb0
[<FFFFFFE6857E50AC>] v4l2_fh_del+0x78
[<FFFFFFE685769774>] uvc_v4l2_release+0xd0
[<FFFFFFE6857D9B10>] v4l2_release+0xcc
[<FFFFFFE684EE192C>] __fput+0xf8
[<FFFFFFE684EE17CC>] ____fput+0xc
[<FFFFFFE684B5C9E0>] task_work_run+0x138
This looks like a side effect of
https://lore.kernel.org/lkml/20230608204517.105396-1-badhri@xxxxxxxxxx/.
Effectively, UVC function tried to disconnect the gadget before
cleaning up resources. However, usb_gadget_unregister_driver which is
removing the function prevents the gadget from disconnecting until the
function is unbound.
As of the patch mentioned above, gadget_unbind_driver holds
udc->connect_lock and calls both usb_gadget_disconnect_locked and
udc->driver->unbind one after the other.
usb_gadget_disconnect_locked calls into UVC Gadget driver as follows:
1. usb_gadget_disconnect_locked
2. configfs_composite_disconnect
3. __composite_disconnect
4. uvc_function_disable
udc->driver->unbind calls into UVC Gadget driver as follows:
1. udc->driver->unbind
2. configfs_composite_unbind
3. purge_configs_funcs
4. uvc_function_unbind
uvc_function_disable notifies the userspace application with
UVC_EVENT_DISCONNECTED which causes the V4L2 node to be released
(or unsubscribed to). Either on unsubscribe or on release, the UVC Gadget
Driver calls uvc_function_disconnect before cleaning up resources. Following
is the problematic stack from uvc_v4l2_disable.
1. uvc_v4l2_disable
2. uvc_function_disconnect
3. usb_function_deactivate
4. usb_gadget_deactivate
usb_gadget_deactivate attempts to lock udc->connect_lock as of the patch
mentioned above.
This means that attempting to unregister the UVC Gadget Driver results in the
V4L2 resource cleanup being stuck behind udc->connect_lock, which will only
be released after uvc_function_unbind finishes. This results in either the
gadget deactivating after the unbind process has finished, or in a Kernel Panic
as it tries to cleanup a V4L2 node that has been purged.
This leaves us with two options:
1. We "fix" the locking in core.c to restore old behavior, and let the
usb_gadget_deactivate call go through without locking. However,
I am not sure about the specifics of the patch were and what exact issue it
was trying to fix.
Badhri, would you know if relaxing the constraints on
usb_gadget_deactivate is feasible? It is possible that other gadget drivers
run into similar issues as UVC driver.
3. UVC Gadget Driver calls usb_function_deactivate to take down the gadget if
the userspace application stops listening to the V4L2 node. However, AFAICT
disable is called as a part of the gadget resetting. So, if the V4L2 node
is released because of UVC_EVENT_DISCONNECT, we can skip calling
usb_function_deactivate as the gadget will be reset anyway.
usb_function documentation seems to agree that if 'disable' is called,
the gadget will be reset/reconfigured shortly:
@disable: (REQUIRED) Indicates the function should be disabled. Reasons
* include host resetting or reconfiguring the gadget, and disconnection.
A dirty Patch for option 2 is attached below which skips calling
usb_function_deactivate if uvc_function_disable was called before. It seems
to work okay in testing. Let me know if the analysis and solutions seems okay
and I can upload a formal patch.
For what it's worth the analysis makes sense; the patch looks ok to me so if the conclusion is to
fix the problem that way I think it's fine, but I'm more inclined to consider this a locking problem
in core - it'd be better to fix it there I think.
Thank you!
---
drivers/usb/gadget/function/f_uvc.c | 12 ++++++++++--
drivers/usb/gadget/function/uvc.h | 1 +
2 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/gadget/function/f_uvc.c b/drivers/usb/gadget/function/f_uvc.c
index 5e919fb65833..cef92243f1f7 100644
--- a/drivers/usb/gadget/function/f_uvc.c
+++ b/drivers/usb/gadget/function/f_uvc.c
@@ -385,7 +385,7 @@ uvc_function_disable(struct usb_function *f)
v4l2_event.type = UVC_EVENT_DISCONNECT;
v4l2_event_queue(&uvc->vdev, &v4l2_event);
- uvc->state = UVC_STATE_DISCONNECTED;
+ uvc->state = UVC_STATE_HOST_DISCONNECTED;
usb_ep_disable(uvc->video.ep);
if (uvc->enable_interrupt_ep)
@@ -410,8 +410,16 @@ uvc_function_disconnect(struct uvc_device *uvc)
{
int ret;
- if ((ret = usb_function_deactivate(&uvc->func)) < 0)
+ if (uvc->state == UVC_STATE_HOST_DISCONNECTED) {
+ /*
+ * Don't deactivate gadget as this is being called in
+ * response to the host resetting. Gadget will be deactivated
+ * anyway. Just update to state as acknowledgement
+ */
+ uvc->state = UVC_STATE_DISCONNECTED;
+ } else if ((ret = usb_function_deactivate(&uvc->func)) < 0) {
uvcg_info(&uvc->func, "UVC disconnect failed with %d\n", ret);
+ }
}
/* --------------------------------------------------------------------------
diff --git a/drivers/usb/gadget/function/uvc.h b/drivers/usb/gadget/function/uvc.h
index 100475b1363e..f1e2bc98dc61 100644
--- a/drivers/usb/gadget/function/uvc.h
+++ b/drivers/usb/gadget/function/uvc.h
@@ -120,6 +120,7 @@ struct uvc_video {
};
enum uvc_state {
+ UVC_STATE_HOST_DISCONNECTED,
UVC_STATE_DISCONNECTED,
UVC_STATE_CONNECTED,
UVC_STATE_STREAMING,