On 03.11.22 at 23:16, Nicolas Dufresne wrote:
[SNIP]
We already had numerous projects where we reported this practice as bugs
to the GStreamer and FFmpeg projects, because it won't work on x86 with dGPUs.
Links? Remember that I read every single bug and email around the GStreamer
project. I maintain both the older and the newer V4L2 support in there. I also
contributed a lot to the mechanisms GStreamer has in place to reverse the
allocation. In fact, it's implemented; the problem is that on generic Linux,
the receiving elements, like the GL elements and the display sinks, don't have
any API they can rely on to allocate memory. Thus, they don't implement what we
call the allocation offer in GStreamer terms. Very often though, on other modern
OSes, or with APIs like VA, the memory offer is replaced by a context. So the
allocation is done from a "context" which is neither an importer nor an
exporter. This is mostly found on macOS and Windows.
Were there APIs suggested to actually make it manageable for userland to
allocate from the GPU? Yes, that is what the Linux Device Allocator idea is
for. Is that API ready? No.
Well, that stuff is absolutely ready:
https://elixir.bootlin.com/linux/latest/source/drivers/dma-buf/heaps/system_heap.c#L175
What do you think I'm talking about all the time?
DMA-buf has a lengthy section about CPU access to buffers and clearly
documents how all of that is supposed to work:
https://elixir.bootlin.com/linux/latest/source/drivers/dma-buf/dma-buf.c#L1160
This includes bracketing of CPU access with dma_buf_begin_cpu_access()
and dma_buf_end_cpu_access(), as well as transaction management between
devices and the CPU, and even implicit synchronization.
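From userspace, this bracketing is exposed through the DMA_BUF_IOCTL_SYNC
ioctl from linux/dma-buf.h, which maps onto dma_buf_begin_cpu_access() and
dma_buf_end_cpu_access() in the kernel. A minimal sketch (the helper name
cpu_write_pattern is mine; the DMA-buf fd is assumed to come from some
exporter such as V4L2, a DRM driver or a DMA heap):

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/dma-buf.h>

/* Fill a mapped DMA-buf with a test pattern, bracketing the CPU
 * access with DMA_BUF_IOCTL_SYNC as the documentation requires.
 * Returns 0 on success, -1 on failure. */
static int cpu_write_pattern(int dmabuf_fd, size_t len)
{
	struct dma_buf_sync sync;
	uint8_t *map;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_SHARED, dmabuf_fd, 0);
	if (map == MAP_FAILED)
		return -1;

	/* Begin the CPU-access bracket (dma_buf_begin_cpu_access). */
	sync.flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_WRITE;
	if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync))
		goto err;

	/* CPU access happens only inside the bracket. */
	memset(map, 0xAA, len);

	/* End the bracket (dma_buf_end_cpu_access). */
	sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_WRITE;
	if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync))
		goto err;

	munmap(map, len);
	return 0;
err:
	munmap(map, len);
	return -1;
}
```

Whether mmap() on the DMA-buf fd succeeds at all depends on the exporter,
but the bracketing protocol itself is the same for every exporter.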
This specification is then implemented by the different drivers
including V4L2:
https://elixir.bootlin.com/linux/latest/source/drivers/media/common/videobuf2/videobuf2-dma-sg.c#L473
As well as the different DRM drivers:
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c#L117
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c#L234
This design was then used by us with various media players on different
customer projects, including QNAP https://www.qnap.com/en/product/ts-877
as well as the newest Tesla
https://www.amd.com/en/products/embedded-automotive-solutions
I won't go into the details here, but we are using exactly the approach
I've outlined to let userspace control the DMA between the different
devices in question. I'm one of the main designers of that, and our
multimedia and Mesa teams have upstreamed quite a number of changes for
this project.
I'm not that familiar with the different ARM-based solutions, because we
have only recently started seeing this work with AMD GPUs, but I'm
pretty sure the design should be able to handle those as well.
So we have clearly proven that this design works, even with special
requirements which are way more complex than what we are discussing
here. We had cases where we used GStreamer to feed DMA-buf handles into
multiple devices with different format requirements, and that seems to
work fine.
-----
But enough of this rant. As I wrote to Lucas as well, this doesn't help
us any further in the technical discussion.
The only technical argument I have is that if some userspace
applications fail to use the provided UAPI while others use it
correctly, then this is clearly not a good reason to change the UAPI,
but rather an argument to change the applications.
If the application should be kept simple and device independent, then
allocating the buffer from the device-independent DMA heaps would be
enough as well, because that provider implements the necessary handling
for dma_buf_begin_cpu_access() and dma_buf_end_cpu_access().
I'm a bit surprised that we are arguing about stuff like this, because
we spent a lot of effort trying to document it. Daniel gave me the job
of fixing this documentation, but after reading through it multiple
times now, I can't see where the design and the desired behavior are
unclear.
What is clearly a bug in the kernel is that we don't reject things which
won't work correctly, and that is what this patch addresses. What we
could talk about is backward compatibility for this patch, because it
might look like it breaks things which previously used to work, at least
partially.
Regards,
Christian.