Hi Hans,

On Fri, Jun 07, 2019 at 04:11:20PM +0200, Hans Verkuil wrote:
> On 6/7/19 3:55 PM, Marek Szyprowski wrote:
> > Hi Hans,
> >
> > On 2019-06-07 15:40, Hans Verkuil wrote:
> >> On 6/7/19 2:47 PM, Hans Verkuil wrote:
> >>> On 6/7/19 2:23 PM, Hans Verkuil wrote:
> >>>> On 6/7/19 2:14 PM, Marek Szyprowski wrote:
> >>>>> On 2019-06-07 14:01, Hans Verkuil wrote:
> >>>>>> On 6/7/19 1:16 PM, Laurent Pinchart wrote:
> >>>>>>> Thank you for the patch.
> >>>>>>>
> >>>>>>> On Fri, Jun 07, 2019 at 10:45:31AM +0200, Hans Verkuil wrote:
> >>>>>>>> The __prepare_userptr() function made the incorrect assumption that if the
> >>>>>>>> same user pointer was used as the last one for which memory was acquired, then
> >>>>>>>> there was no need to re-acquire the memory. This assumption was never properly
> >>>>>>>> tested, and after doing that it became clear that this was in fact wrong.
> >>>>>>> Could you explain in the commit message why the assumption is not
> >>>>>>> correct ?
> >>>>>> You can free the memory, then allocate it again and you can get the same pointer,
> >>>>>> even though it is not necessarily using the same physical pages for the memory
> >>>>>> that the kernel is still using for it.
> >>>>>>
> >>>>>> Worse, you can free the memory, then allocate only half the memory you need and
> >>>>>> get back the same pointer. vb2 wouldn't notice this. And it seems to work (since
> >>>>>> the original mapping still remains), but this can corrupt userspace memory
> >>>>>> causing the application to crash. It's not quite clear to me how the memory can
> >>>>>> get corrupted. I don't know enough of those low-level mm internals to understand
> >>>>>> the sequence of events.
> >>>>>>
> >>>>>> I have test code for v4l2-compliance available if someone wants to test this.
> >>>>> I'm interested, I would really like to know what happens in the mm
> >>>>> subsystem in such case.
> >>>> Here it is:
> >>>>
> >>>> diff --git a/utils/v4l2-compliance/v4l2-test-buffers.cpp b/utils/v4l2-compliance/v4l2-test-buffers.cpp
> >>>> index be606e48..9abf41da 100644
> >>>> --- a/utils/v4l2-compliance/v4l2-test-buffers.cpp
> >>>> +++ b/utils/v4l2-compliance/v4l2-test-buffers.cpp
> >>>> @@ -797,7 +797,7 @@ int testReadWrite(struct node *node)
> >>>>  	return 0;
> >>>>  }
> >>>>
> >>>> -static int captureBufs(struct node *node, const cv4l_queue &q,
> >>>> +static int captureBufs(struct node *node, cv4l_queue &q,
> >>>>  		const cv4l_queue &m2m_q, unsigned frame_count, int pollmode,
> >>>>  		unsigned &capture_count)
> >>>>  {
> >>>> @@ -962,6 +962,21 @@ static int captureBufs(struct node *node, const cv4l_queue &q,
> >>>>  				buf.s_flags(V4L2_BUF_FLAG_REQUEST_FD);
> >>>>  				buf.s_request_fd(buf_req_fds[req_idx]);
> >>>>  			}
> >>>> +			if (v4l_type_is_capture(buf.g_type()) && q.g_memory() == V4L2_MEMORY_USERPTR) {
> >>>> +				printf("\nidx: %d", buf.g_index());
> >>>> +				for (unsigned p = 0; p < q.g_num_planes(); p++) {
> >>>> +					printf(" old buf[%d]: %p ", p, buf.g_userptr(p));
> >>>> +					fflush(stdout);
> >>>> +					free(buf.g_userptr(p));
> >>>> +					void *m = calloc(1, q.g_length(p)/2);
> >>>> +
> >>>> +					fail_on_test(m == NULL);
> >>>> +					q.s_userptr(buf.g_index(), p, m);
> >>>> +					printf("new buf[%d]: %p", p, m);
> >>>> +					buf.s_userptr(m, p);
> >>>> +				}
> >>>> +				printf("\n");
> >>>> +			}
> >>>>  			fail_on_test(buf.qbuf(node, q));
> >>>>  			fail_on_test(buf.g_flags() & V4L2_BUF_FLAG_DONE);
> >>>>  			if (buf.g_flags() & V4L2_BUF_FLAG_REQUEST_FD) {
> >>>>
> >>>>
> >>>>
> >>>> Load the vivid driver and just run 'v4l2-compliance -s10' and you'll see:
> >>>>
> >>>> ...
> >>>> Streaming ioctls:
> >>>> test read/write: OK
> >>>> test blocking wait: OK
> >>>> test MMAP (no poll): OK
> >>>> test MMAP (select): OK
> >>>> test MMAP (epoll): OK
> >>>> Video Capture: Frame #000
> >>>> idx: 0 old buf[0]: 0x7f71c6e7c010 new buf[0]: 0x7f71c6eb4010
> >>>> Video Capture: Frame #001
> >>>> idx: 1 old buf[0]: 0x7f71c6e0b010 new buf[0]: 0x7f71c6e7b010
> >>>> Video Capture: Frame #002
> >>>> idx: 0 old buf[0]: 0x7f71c6eb4010 free(): invalid pointer
> >>>> Aborted
> >>> To clarify: two full size buffers are allocated and queued (that happens in setupUserPtr()),
> >>> then streaming starts and captureBufs is called which basically just calls dqbuf
> >>> and qbuf.
> >>>
> >>> Tomasz pointed out that all the pointers in this log are actually different. That's
> >>> correct, but here is a log where the old and new buf ptr are the same:
> >>>
> >>> Streaming ioctls:
> >>> test read/write: OK
> >>> test blocking wait: OK
> >>> test MMAP (no poll): OK
> >>> test MMAP (select): OK
> >>> test MMAP (epoll): OK
> >>> Video Capture: Frame #000
> >>> idx: 0 old buf[0]: 0x7f1094e16010 new buf[0]: 0x7f1094e4e010
> >>> Video Capture: Frame #001
> >>> idx: 1 old buf[0]: 0x7f1094da5010 new buf[0]: 0x7f1094e15010
> >>> Video Capture: Frame #002
> >>> idx: 0 old buf[0]: 0x7f1094e4e010 new buf[0]: 0x7f1094e4e010
> >>> Video Capture: Frame #003
> >>> idx: 1 old buf[0]: 0x7f1094e15010 free(): invalid pointer
> >>> Aborted
> >>>
> >>> It's weird that the first log fails that way: if the pointers are different,
> >>> then vb2 will call get_userptr and it should discover that the buffer isn't
> >>> large enough, causing qbuf to fail. That doesn't seem to happen.
> >> I think that the reason for this corruption is that the memory pool used
> >> by glibc is now large enough for vb2 to think it can map the full length
> >> of the user pointer into memory, even though only the first half is actually
> >> from the buffer that's allocated.
> >> When you capture a frame you just overwrite
> >> a random part of the application's memory pool, causing this invalid pointer.
> >>
> >> But that's a matter of garbage in, garbage out. So that's not the issue here.
> >>
> >> The real question is what happens when you free the old buffer, allocate a
> >> new buffer, end up with the same userptr, but it's using one or more different
> >> pages for its memory compared to the mapping that the kernel uses.
> >>
> >> I managed to reproduce this with v4l2-ctl:
> >>
> >> diff --git a/utils/v4l2-ctl/v4l2-ctl-streaming.cpp b/utils/v4l2-ctl/v4l2-ctl-streaming.cpp
> >> index 28b2b3b9..8f2ed9b5 100644
> >> --- a/utils/v4l2-ctl/v4l2-ctl-streaming.cpp
> >> +++ b/utils/v4l2-ctl/v4l2-ctl-streaming.cpp
> >> @@ -1422,6 +1422,24 @@ static int do_handle_cap(cv4l_fd &fd, cv4l_queue &q, FILE *fout, int *index,
> >>  	 * has the size that fits the old resolution and might not
> >>  	 * fit to the new one.
> >>  	 */
> >> +	if (q.g_memory() == V4L2_MEMORY_USERPTR) {
> >> +		printf("\nidx: %d", buf.g_index());
> >> +		for (unsigned p = 0; p < q.g_num_planes(); p++) {
> >> +			unsigned *pb = (unsigned *)buf.g_userptr(p);
> >> +			printf(" old buf[%d]: %p first pixel: 0x%x", p, buf.g_userptr(p), *pb);
> >> +			fflush(stdout);
> >> +			free(buf.g_userptr(p));
> >> +			void *m = calloc(1, q.g_length(p));
> >> +
> >> +			if (m == NULL)
> >> +				return QUEUE_ERROR;
> >> +			q.s_userptr(buf.g_index(), p, m);
> >> +			if (m == buf.g_userptr(p))
> >> +				printf(" identical new buf");
> >> +			buf.s_userptr(m, p);
> >> +		}
> >> +		printf("\n");
> >> +	}
> >>  	if (fd.qbuf(buf) && errno != EINVAL) {
> >>  		fprintf(stderr, "%s: qbuf error\n", __func__);
> >>  		return QUEUE_ERROR;
> >>
> >>
> >> Load vivid, setup a pure white test pattern:
> >>
> >> v4l2-ctl -c test_pattern=6
> >>
> >> Now run v4l2-ctl --stream-user and you'll see:
> >>
> >> idx: 0 old buf[0]: 0x7f91551cb010 first pixel: 0x80ea80ea identical new buf
> >> <
> >> idx: 1 old buf[0]: 0x7f915515a010 first pixel: 0x80ea80ea identical new buf
> >> <
> >> idx: 2 old buf[0]: 0x7f91550e9010 first pixel: 0x80ea80ea identical new buf
> >> <
> >> idx: 3 old buf[0]: 0x7f9155078010 first pixel: 0x80ea80ea identical new buf
> >> <
> >> idx: 0 old buf[0]: 0x7f91551cb010 first pixel: 0x0 identical new buf
> >> <
> >> idx: 1 old buf[0]: 0x7f915515a010 first pixel: 0x0 identical new buf
> >> < 5.00 fps
> >>
> >> idx: 2 old buf[0]: 0x7f91550e9010 first pixel: 0x0 identical new buf
> >> <
> >> idx: 3 old buf[0]: 0x7f9155078010 first pixel: 0x0 identical new buf
> >>
> >> The first four dequeued buffers are filled with data, after that the
> >> returned buffer is empty because vivid is actually writing to different
> >> memory pages.
> >>
> >> With this patch the first pixel is always non-zero.
> >
> > Good catch. The question is whether we treat that as undefined behavior
> > and keep the optimization for 'good applications' or assume that every
> > broken userspace code has to be properly handled. The good thing is that
> > there is still imho no security issue. The physical pages gathered by
>
> Yeah, that scared me for a bit, but it all looks secure.
>
> > vb2 in worst case belongs to no one else (vb2 is their last user, they
> > are not yet returned to free pages pool).
>
> I see three options:
>
> 1) just always reacquire the buffer, and if anyone complains about it
> being slower we point them towards DMABUF.

That doesn't really help right now, as DMABUF has the same property: the
pages are mapped and unmapped every time the buffer is touched by a device.
That could be addressed, though; it's not an inherent property of the DMABUF
buffer type.

>
> 2) keep the current behavior, but document it.

I'd favour this. Ideally there should be a way to discard a buffer, and
DESTROY_BUF has been proposed before.

>
> 3) as 2), but also add a new buffer flag that forces a reacquire of the
> buffer. This could be valid for DMABUF as well.
> E.g.:
>
> V4L2_BUF_FLAG_REACQUIRE
>
> I'm leaning towards the third option since it won't slow down existing
> implementations, yet if you do change the userptr every time, then you
> can now force this to work safely.

This would get around the problem of reallocating, but in a use case
specific way. DESTROY_BUF would be more generic.

But I admit it would require some rework of the vb2 codebase, and possibly
of drivers as well. So I think this is a reasonable compromise.

--
Kind regards,

Sakari Ailus
sakari.ailus@xxxxxxxxxxxxxxx
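
To make the difference between options 2) and 3) concrete, here is a rough C sketch of the reacquire decision. It is not the vb2 code, only a model of it: struct sketch_plane, sketch_needs_reacquire(), sketch_demo() and the SKETCH_BUF_FLAG_REACQUIRE value are names invented for this example, and V4L2_BUF_FLAG_REACQUIRE itself is only a proposal in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit for the flag proposed in this thread; not part of the V4L2 uAPI. */
#define SKETCH_BUF_FLAG_REACQUIRE 0x1u

/* What the kernel side remembers about the last plane it pinned (illustrative only). */
struct sketch_plane {
	uintptr_t userptr;	/* userspace address, 0 if nothing is pinned yet */
	size_t length;		/* plane length in bytes */
};

/*
 * Return true if the pages behind the queued plane must be acquired again.
 * Without the flag this is a plain pointer/length comparison (option 2);
 * with the flag set, userspace forces a reacquire (option 3).
 */
bool sketch_needs_reacquire(const struct sketch_plane *cached,
			    const struct sketch_plane *queued,
			    unsigned int flags)
{
	if (flags & SKETCH_BUF_FLAG_REACQUIRE)
		return true;
	if (cached->userptr == 0)
		return true;
	return cached->userptr != queued->userptr ||
	       cached->length != queued->length;
}

/* Walk through the cases from the thread. */
void sketch_demo(void)
{
	struct sketch_plane cached = { 0x1000, 4096 };
	/* free() followed by calloc() handed back the same address and size. */
	struct sketch_plane reused = { 0x1000, 4096 };
	struct sketch_plane moved = { 0x2000, 4096 };

	/* The comparison alone cannot see that the pages may have changed. */
	assert(!sketch_needs_reacquire(&cached, &reused, 0));
	/* The flag lets the application ask for a fresh mapping anyway. */
	assert(sketch_needs_reacquire(&cached, &reused, SKETCH_BUF_FLAG_REACQUIRE));
	/* A different pointer is always reacquired. */
	assert(sketch_needs_reacquire(&cached, &moved, 0));
}
```

Calling sketch_demo() from a test program exercises all three cases. The point of the sketch is only that an address comparison cannot detect a free()/calloc() cycle that returns the same pointer, so either the kernel always reacquires or userspace has to say so explicitly.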