Re: [PATCH 2/3] usb: gadget: uvc: cleanup request when not in correct state

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Sep 18, 2023 at 04:40:07PM -0700, Avichal Rakesh wrote:


On 9/18/23 14:43, Michael Grzeschik wrote:
On Mon, Sep 18, 2023 at 12:02:11PM -0700, Avichal Rakesh wrote:
On 9/16/23 16:23, Michael Grzeschik wrote:
On Fri, Sep 15, 2023 at 07:41:05PM -0700, Avichal Rakesh wrote:
On 9/15/23 16:32, Michael Grzeschik wrote:
On Mon, Sep 11, 2023 at 09:52:22PM -0700, Avichal Rakesh wrote:
On 9/10/23 17:24, Michael Grzeschik wrote:
The uvc_video_enable function of the uvc-gadget driver is dequeing and
immediately deallocs all requests on its disable codepath. This is not
save since the dequeue function is async and does not ensure that the
requests are left unlinked in the controller driver.

By adding the ep_free_request into the completion path of the requests
we ensure that the request will be properly deallocated.

Signed-off-by: Michael Grzeschik <m.grzeschik@xxxxxxxxxxxxxx>
---
 drivers/usb/gadget/function/uvc_video.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/usb/gadget/function/uvc_video.c b/drivers/usb/gadget/function/uvc_video.c
index 4b6e854e30c58c..52e3666b51f743 100644
--- a/drivers/usb/gadget/function/uvc_video.c
+++ b/drivers/usb/gadget/function/uvc_video.c
@@ -256,6 +256,12 @@ uvc_video_complete(struct usb_ep *ep, struct usb_request *req)
     struct uvc_device *uvc = video->uvc;
     unsigned long flags;

+    if (uvc->state == UVC_STATE_CONNECTED) {
+        usb_ep_free_request(video->ep, ureq->req);
nit: You can probably just call usb_ep_free_request with req instead of ureq->req.

Thanks, thats a good point.

+        ureq->req = NULL;
+        return;
+    }
+
     switch (req->status) {
     case 0:
         break;

Perhaps I am missing something here, but I am not sure how this alone
fixes the use-after-free issue. uvcg_video_enable still deallocates
_all_ usb_requests right after calling usb_ep_dequeue, so it is still
possible that an unreturned request is deallocated, and now it is
possible that the complete callback accesses a deallocated ureq :(

Since the issue I saw was usually coming from the list_del_entry_valid check in
the list_del_entry of the giveback function, the issue was probably just not
triggered anymore as the complete function did exit early.

So this fix alone is actually bogus without a second patch I had in the stack.
The second patch I am refering should change the actual overall issue:

https://lore.kernel.org/linux-usb/20230915233113.2903645-1-m.grzeschik@xxxxxxxxxxxxxx/T/#u

This early list_del and this patch here should ensure that the
concurrent functions are not handling already freed memory.

Oh, the patch linked above is interesting. It effectively force removes the dwc3_request
from whatever list it belongs to? So if DWC3's interrupt handler is delayed past
UVC gadget's ep_free_request call, then it won't see the requests in its cancelled
list at all. However, this setup is still prone to errors. For example, there is now
a chance that gadget_ep_free_request is called twice for one request. A scheduling
like the following might cause double kfree:

1. uvcg_video_enable calls usb_ep_dequeue for all usb_requests
2. While the usb_ep_dequeues are being processed, dwc3's interrupt handler starts
  calling the complete callbacks.
3. The complete callback calls gadget_ep_free_request (calling kfree as a result)
4. Meanwhile, uvcg_video_enable has moved to uvc_video_free_requests which also
  calls gadget_ep_free_request (calling kfree).

There is currently (even in your patches) no synchronization between calls to
gadget_ep_free_request via complete callback and uvcg_video_enable, which will
inevitably call usb_ep_free_request twice for one request.

Does that make sense, or am I misunderstanding some part of the patch?

The overall concept is correct. But in detail the
uvc_video_free_requests is checking that video->ureq[i].req is not NULL.

With our previous call of ep_free_request in the complete handler, the
ureq->req pointer in focus was already set to NULL. So the
uvc_video_free_requests function will skip that extra free.


Is there any form of synchronization between uvc_video_request and the
complete callback? As I see it, the dwc3 interrupt thread and the v4l2
ioctl thread (which calls uvcg_video_enable) are fully independent, so
the calls made by them are free to be interleaved arbitrarily, so an
interleaving like this is technically possible:

+------+------------------------------------+---------------------------------------------+
| time |            ioctl_thread            |            dwc3 interrupt handler           |
+======+====================================+=============================================+
|   1  | -uvc_v4l2_streamoff                |                                             |
|   2  | |-uvcg_video_enable                |                                             |
|   3  | ||-usb_ep_dequeue                  |                                             |
|   4  | ||                                 | -dwc3_process_event_buf                     |
|   5  | ||-uvc_video_free_requests         | |                                           |
|   6  | |||                                | |-dwc3_gadget_ep_cleanup_cancelled_requests |
|   7  | |||                                | ||-dwc3_gadget_giveback                     |
|   8  | |||                                | |||-uvc_video_complete                      |
|   9  | |||-check ureq->req != NULL [true] | ||||                                        |
|  10  | ||||-usb_ep_free_request           | ||||                                        |
|  11  | |||||-dwc3_ep_free_request         | ||||                                        |
|  12  | ||||||-kfree [first call]          | ||||                                        |
|  13  | ||||                               | ||||-usb_ep_free_request                    |
|  14  | ||||                               | |||||-dwc3_ep_free_request                  |
|  15  | ||||                               | ||||||-kfree [second call]                  |
|  16  | ||||                               | ||||-set ureq->req = NULL                   |
|  17  | ||||-set ureq->req = NULL          |                                             |
+------+------------------------------------+---------------------------------------------+

A situation like this means that dwc3_ep_free_request can be called
twice for a particular usb_request. This is obviously low probability,
but a race condition here means we'll start seeing very vague and hard
to repro crashes or memory inconsistencies when using the uvc gadget.

I do apologize if I've missed something obvious with your changes that
prevents such interleaving. I don't currently see any locking or
other synchronization mechanism in your changes. Is there something
in dwc3 that prevents this situation?

I think you have pointed it out totally clear. This is obviously the
case. It just did not trigger here. But the window is there and has to
be locked in some way.

For now we have two options to solve it.

1) Trying to avoid this double code path of the complete callback and
uvc_video_free_requests. This is what your patches are already doing.

But for now I am not so pleased with the timeout concept by waiting for
the complete interrupt to be called. This is also a shot in the dark as
the latency depends on the scheduler and the amount of potential
requests that are being handled.

I agree, a timeout is not the most elegant of solutions and given a
weird enough scheduler, will run into issues as well.


2) Locking both codepathes around the resource in question so the issue
is avoided.

However, I am also not a fried of many locks.

Perhaps it is possible to use a combination of wait_for_completion in
the uvc_video_free_requests and a complete callback in
uvc_video_complete for those requests that are not listed in the
req_free list.

What do you think?

There might be a way that builds on your idea of cleaning up in the complete callback.
It would rely on having a uvc_requests that aren't bulk allocated, which may have a
performance impact.

Since the allocation will only be done once, this performance impact is
should not be critical.

I am imagining something like the following:
 1. Instead of allocating a bulk of uvc_requests, we allocate them
    one at a time and add them to uvc_video.ureq
 2. uvc_video.ureq becomes a list_head containing all the individual
    requests
 3. We add a sentinel flag in uvc_request that says the request is
    now stale. This flag is protected by uvc_video->req_lock
 4. uvc_video_complete looks at  this flag to deallocate both
    usb_request and uvc_request.
 5. uvcg_video_enable looks something like the following:
      uvcg_video_enable(...) {
        ...
        lock(req_lock);
        forall (uvc_requests->ureqs) {ureq->stale = true}
        unlock(req_lock);
        usb_ep_dequeue all reqs

        uvc_video_free_requests(...)
        ...
      }
 6. uvc_video_complete looks something like:
      uvc_video_complete(...) {
        // at the start
        lock(req_lock)
        is_stale = ureq->stale;
        unlock(req_lock);

        if (is_stale) {
          usb_ep_free_request();
          dealloc corresponding uvc_request();
          return;
        }

        ...

        lock(req_lock);
        // possible that request became stale while we were handling stuff
        if (!ureq->stale) {
          list_add_tail(&req->list, &video->req_free);
        } else {
          usb_ep_free_request();
          dealloc corresponding uvc_request();
        }
        unlock(req_lock);
      }
 7. uvc_video_free_requests can freely dealloc usb_requests/uvc_requests in
    req_free because we can be certain that uvc_video_complete won't modify
    it once requests have been marked stale, and the stale requests in flight
    will be cleaned up by the complete callback.

Effectively, we freeze the state of req_free before dequeuing, and all
inflight requests are considered the responsibility of the complete handler
from that point onwards. The gadget is only responsible for freeing requests it
currently owns.

I think this should ensure that we never have a situation where the ownership of the
requests are undefined, and only one thread is responsible for freeing any given request.

Hope that makes sense!

So you found a way to secure this also with the already available
req_lock then. Nice!

Also what you suggest is to move from the array model we currently have
to dynamic allocation in a linked list.

I would suggest some more adaptions.

Keep to allocate all requests dynamicaly as you suggest instead of the
bulk array.

Rewrite the uvc_video_free_requests to iterate over the video->req_free
list instead of all available requests to take care of all requests
that are truely freed.

Take this patch we started this thread with and expand it to
clean up not only the usb_request but also the uvc_request
like you suggested in your pseudo code.

Since we check for UVC_STATE_CONNECTED already in the comlete handler
this is a superset of your stale flag anyway. And every request
that is currently in flight is not part of the req_free list, which
makes the uvc_video_free_requests function free to run without making
no harm.

Does this sound better?

Regards,
Michael


--
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux