Re: [PATCH] usb: gadget: uvc: Move usb_ep_disable() to uvcg_video_enable()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Avichal

On Fri, Sep 08, 2023 at 11:54:40PM +0800, Avichal Rakesh wrote:
Apologies for the late reply, I have been out travelling.
On Mon, Sep 4, 2023 at 10:36 PM Michael Grzeschik <mgr@xxxxxxxxxxxxxx> wrote:
Cc'ing: Thinh Nguyen <Thinh.Nguyen@xxxxxxxxxxxx>

On Wed, Aug 30, 2023 at 01:38:23PM -0700, Avichal Rakesh wrote:
>On 6/15/23 10:15, Laurent Pinchart wrote:
>> On Thu, Jun 15, 2023 at 10:34:06AM +0100, Daniel Scally wrote:
>>> Currently the UVC Gadget's .set_alt() callback disables the USB
>>> endpoint, following which a V4L2 event is queued that closes
>>> down the workqueue. This ordering results in repeated calls to
>>> usb_ep_queue() from the workqueue whilst usb_ep_disable() is
>>> running - behaviour that the documentation for usb_ep_disable()
>>> specifically prohibits.
>>>
>>> Move the call to usb_ep_disable() until after cancel_work_sync(),
>>> which will guarantee the endpoint is no longer in use when
>>> usb_ep_disable() is called.
>>>
>>> Signed-off-by: Daniel Scally <dan.scally@xxxxxxxxxxxxxxxx>
>>> ---
>>>  drivers/usb/gadget/function/f_uvc.c     | 3 ---
>>>  drivers/usb/gadget/function/uvc_video.c | 4 ++++
>>>  2 files changed, 4 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/usb/gadget/function/f_uvc.c b/drivers/usb/gadget/function/f_uvc.c
>>> index 5e919fb65833..4b91bd572a83 100644
>>> --- a/drivers/usb/gadget/function/f_uvc.c
>>> +++ b/drivers/usb/gadget/function/f_uvc.c
>>> @@ -337,9 +337,6 @@ uvc_function_set_alt(struct usb_function *f, unsigned interface, unsigned alt)
>>>             if (uvc->state != UVC_STATE_STREAMING)
>>>                     return 0;
>>>
>>> -           if (uvc->video.ep)
>>> -                   usb_ep_disable(uvc->video.ep);
>>> -
>>>             memset(&v4l2_event, 0, sizeof(v4l2_event));
>>>             v4l2_event.type = UVC_EVENT_STREAMOFF;
>>>             v4l2_event_queue(&uvc->vdev, &v4l2_event);
>>
>> If we don't disable the endpoint here, should we return
>> USB_GADGET_DELAYED_STATUS here (and call uvc_function_setup_continue()
>> in uvc_v4l2_streamoff()) or is that not needed ? The uvc->state updated
>> should then possibly be moved to uvc_v4l2_streamoff() (checking if this
>> would cause any race condition would also be a nice bonus :-)).
>>
>
>Hey all,
>
>First off, apologies for reviving an old thread. We've also been seeing the
>"no resource for ep2in" warning from dwc3 controller, followed by the UVC gadget
>not streaming any frames, when there is a quick STREAMOFF->STREAMON sequence.
>
>It looks like this is the same root cause as what Dan mentioned in
>https://lore.kernel.org/20230531085544.253363-1-dan.scally@xxxxxxxxxxxxxxxx/
>and this patch seems to solve. (Thank you Dan, for posting the stacktrace in
>that email thread! I had been banging my head for a couple of days before
>thinking of looking up the exact stack :|)
>
>IIUC, this stems from workqueue not fully shutting down by the time the usb
>endpoint is disabled and we need stronger guarantees that the workqueue pumping
>usb_requests doesn't accidentally queue usb_requests _after_ we've disabled
>the streaming endpoint on the usb controller.
>
>Attached is a patch that attempts to address the concerns raised here and sets
>up some stronger timing and memory guarantees.
>
>So here are the list of changes over what Dan had already started:
>
> - Return USB_GADGET_DELAYED_STATUS from set_alt. This is to ensure there are no
>   more control transfers while we're waiting on the workqueue to wind down.
>
> - Move updating uvc->state to uvc_v4l2_streamoff as Laurent suggested.
>   This ensures that setting uvc->state to STREAMING or CONNECTED
>   happens as a result of streamoff and streamon events which makes things
>   easier to reason about.
>
> - Guard video_pump with uvc->state so the thread can be stopped by setting
>   uvc->state to anything other than UVC_STATE_STREAMING. This effectively makes
>   uvc->state a flag for the video_pump loop to exit. This is the same flag that
>   the complete callback uses to restart the workqueue, so all interactions
>   with the controller are effectively guarded by the same flag now.
>
> - Set uvc->state to UVC_STATE_CONNECTED before winding down the work queue.
>   Now that all usb logic is guarded by the same flag, setting the flag should
>   stop all usb packet queuing once current execution finishes.
>
> - Add some memory guarantees in uvcg_video_enable(). At the moment, we don't
>   really consider the fact that usb_ep_dequeue is async, which means that the
>   usb_requests may not have been returned to the gadget before we start
>   deallocating them. With this patch, we wait until all usb_requests have been
>   returned to us before deallocating them (this adds a little bit of
>   bookkeeping, but nothing too complicated).

I am currently trying to solve that by preparing a patch that is
fixing the use of the requests when deallocating them. Since currently
the uvc_gadget is also running into wild use after free issues because
of exactly that async dequeue and dealloc situation.

Do you already have a patch up for this? It seems my LKML-fu is
failing and I can't seem to find the thread. If you aren't too deep
into the patch, can you take a look at the request counting mechanism
added in my patch? If you have a (somewhat) consistent repro of the
use-after-dealloc issue, runnin it through the whole patch would be
very appreciated! It is supposed to fix the exact problem you've
described.

I just send out v1:

https://lore.kernel.org/linux-usb/20230911002451.2860049-1-m.grzeschik@xxxxxxxxxxxxxx/

My patches did go back and forth between changes in the uvc-gadget
driver and the device-controller driver. My latest version was including
calling free_request from the complete handler. I found that option
while looking into the uac2 gadget code. It took away a lot of
pain while trying to fix the issue in the dwc3 gadget driver.

IMHO it should be
save to call dealloc after calling dequeue. Which is probably true for
other usb device controller driver other then dwc3.

Perhaps Thinh or someone better versed in Gadget API can chime in on
this, but as it stands usb_ep_dequeue specifically says that it is
async, and gadget drivers must wait on the complete callbacks to
regain ownership of the usb_request. Until the API change is made, UVC
should adhere to the current API?

Since you mention that usb_ep_dequeue is async I am very confident
that it is safe to free the request in the completion handler.

Although we could cleanup and improve the uvc_video_free_requests
function itself. But with the patches I have here the use
after free was gone so far. So they should be good so far.

For some background. The dwc3 is putting the requests into a cancelled list
that will be cleared by the interrupt handler and that is dequeuing them
instead. In between the dequeue call and the interrupt call the uvc layer could
dealloc the request which leads the interrupt handler to dequeue an
already freed request.

This roughly tracks with what I gleaned from skimming the DWC3 code as
well. In local tests the complete calls were always timely and I never
actually ran into the situation where UVC deallocated an unowned
request, but as someone (I think it was Alan?)  said in a previous
thread: technically possible just means it will happen eventually

Please do review/test the patch. I'll send out a formal patch on
Monday once I am back, but would love to have some early eyes take a
look in case there is something obvious I missed.

First I tested your patch with my sketchy setup where I more then once
run into the use after free condition. But with the patch this was not
gone.

I also looked over the patch. As what I saw this is a possible
alternative to my patches. The changes are doing some similar things.
But the code is changing to many things at once. Please split the code
up into more logical chunks. Perhaps you could try to rebase it on
my patches. And start from there.

Thanks,
Michael

--
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux