Thanks for the detailed writeup, and it was good to meet you at XDC. Comments inline below:
On 10/18/2016 04:40 PM, Marek Olšák wrote:
Hi,
The text below describes how open source AMDGPU buffer sharing works.
I hope you'll find some useful bits in it.
Producer = allocates a buffer (or texture), exports its handle
(DMABUF, etc.), and can use the buffer in various ways
Consumer = imports the handle and can use the buffer in various ways
*** Producer-consumer interaction ***
1) On handle export, the producer receives these flags:
- READ, WRITE, READ+WRITE: Describe the expected usage in the consumer.
* The producer decides if it needs to disable compression based on
those flags.
- EXPLICIT_FLUSH flag: The producer will receive an explicit
"flush_resource" call before the consumer starts using the buffer.
This hint means the producer doesn't have to track when to do
decompression while sharing the buffer with the consumer. (A sketch of
this logic follows the list.)
2) Passing metadata (tiling, pixel ordering, format, layout) info
between the producer and consumer:
- All AMDGPU buffer/texture allocations have 256 bytes (64 dwords) of
internal per-allocation metadata storage that lives in kernel space.
There are amdgpu-specific ioctls that can "set" and "get" the
metadata. Any process that has a buffer handle can do that. (A
consumer-side sketch follows this list.)
* The producer writes the metadata, the consumer reads it.
- The producer-consumer interop API doesn't know about the metadata.
All you need to pass around is a buffer handle. (KMS, DMABUF, etc.)
* There was a note during the talk that DMABUF doesn't have any
metadata. Well, I just told you that it has, but it's private to
amdgpu and possibly accessible to other kernel drivers too.
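For reference, a minimal sketch of the consumer side of those
set/get ioctls, using DRM_IOCTL_AMDGPU_GEM_METADATA from libdrm's
amdgpu_drm.h. It assumes an already-open amdgpu DRM fd (drm_fd) and an
imported DMABUF fd (dmabuf_fd), and abbreviates error handling:

    #include <stdint.h>
    #include <string.h>
    #include <xf86drm.h>
    #include <amdgpu_drm.h>

    /* Read the producer's metadata through the buffer handle alone. */
    static int read_producer_metadata(int drm_fd, int dmabuf_fd,
                                      uint32_t out[64], uint32_t *out_size)
    {
        struct drm_amdgpu_gem_metadata args;
        uint32_t gem_handle;

        /* Turn the DMABUF fd into an amdgpu GEM handle. */
        if (drmPrimeFDToHandle(drm_fd, dmabuf_fd, &gem_handle))
            return -1;

        memset(&args, 0, sizeof(args));
        args.handle = gem_handle;
        args.op = AMDGPU_GEM_METADATA_OP_GET_METADATA;

        if (drmIoctl(drm_fd, DRM_IOCTL_AMDGPU_GEM_METADATA, &args))
            return -1;

        /* Up to 64 dwords (256 bytes) of producer-defined data. */
        memcpy(out, args.data.data, args.data.data_size_bytes);
        *out_size = args.data.data_size_bytes;
        return 0;
    }

The producer side is symmetric: fill in args.data.data, set
args.data.data_size_bytes, and use AMDGPU_GEM_METADATA_OP_SET_METADATA.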
OK. I believe someone pointed this out during my talk or afterwards as
well. Some drivers are using this method, but there seems to be some
debate over whether this is the preferred general design. Others have
told me this isn't the right mechanism to store this sort of metadata,
but I'm not familiar with the specific counter arguments.
* We can build upon this idea. I think the worst thing to do would
be to add metadata handling to driver-agnostic userspace APIs. Really,
driver-agnostic APIs shouldn't know about that, because they can't
understand all the hw-specific information encoded in the metadata.
Also, when you want to change the metadata format, you only have to
update the affected drivers, not userspace APIs.
How does this kernel-side metadata interact with userspace driver
suballocation, or with application-managed suballocation in APIs such
as Vulkan?
Thanks,
-James
3) Internal AMDGPU metadata storage format
- The header contains: Vendor ID, PCI ID, and version number.
- The header is followed by PCI-ID-specific data. The PCI ID and the
version number define the format.
- If the consumer runs on a different device, it must read the header
and parse the metadata accordingly. This implies that the
driver-specific consumer code needs to know about all potential
producer devices. (A sketch of such a header follows.)
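The exact layout is amdgpu-internal and not spelled out here, so the
following is only a plausible reconstruction of the description above;
every name in it is hypothetical:

    #include <stdint.h>

    /* Hypothetical header layout mirroring the description above. */
    struct shared_metadata_header {
        uint16_t vendor_id;  /* PCI vendor ID of the producing GPU */
        uint16_t device_id;  /* PCI device ID; selects the payload format */
        uint32_t version;    /* revision of that device's payload format */
        /* Followed by up to 256 - 8 bytes of device-specific payload:
         * tiling mode, pixel ordering, format, layout, ... */
    };

    /* Hypothetical registry of per-device payload parsers. */
    extern const void *lookup_parser(uint16_t device_id, uint32_t version);

    /* A consumer on another device can only accept the buffer if it
     * knows how to parse this (device_id, version) pair. */
    int consumer_understands(const struct shared_metadata_header *h)
    {
        return lookup_parser(h->device_id, h->version) != 0;
    }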
Bottom line: DMABUF handles alone are fully sufficient for sharing
buffers/textures between devices and processes from the AMDGPU point
of view.
HW driver implementation: The driver doesn't know anything about the
users of exported or imported buffers. It only acts based on the few
flags described in section 1. So far that's all we've needed.
*** Use cases ***
1) DRI (producer: application; consumer: X server)
- The producer receives these flags: READ, EXPLICIT_FLUSH. The X
server will treat the shared "texture" as read-only. EXPLICIT_FLUSH
ensures the texture can stay compressed, and "flush_resource" will be
called as part of SwapBuffers and of glFlush when rendering to
GL_FRONT.
- The X server can run on a different device. In that case, the window
system API passes the "LINEAR" flag to the driver during allocation.
That's suboptimal and fixable.
2) OpenGL-OpenCL interop (OpenGL always exports handles, OpenCL always
imports handles)
- Possible flags: READ, WRITE, READ+WRITE
- OpenCL doesn't give us any other flags, so we are stuck with those
(see the sketch after this list).
- Inter-device sharing is possible if the consumer understands the
producer's metadata and tiling layouts.
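As an illustration of how little the consumer can say, here is the
standard OpenCL GL-sharing entry point; the cl_mem access flag is the
only usage information that crosses the boundary. The context and
buffer name are assumed to exist:

    #include <CL/cl.h>
    #include <CL/cl_gl.h>

    /* gl_buffer is a GL buffer object name exported by the producer. */
    cl_mem import_gl_buffer(cl_context ctx, cl_GLuint gl_buffer, cl_int *err)
    {
        /* CL_MEM_READ_ONLY, CL_MEM_WRITE_ONLY or CL_MEM_READ_WRITE map
         * straight to the READ/WRITE/READ+WRITE flags above; there is
         * no way to request anything like EXPLICIT_FLUSH. */
        return clCreateFromGLBuffer(ctx, CL_MEM_READ_ONLY, gl_buffer, err);
    }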
(amdgpu actually stores 2 different metadata blocks per allocation,
but the simpler one is too limited at only 8 bytes)
Marek
On Wed, Oct 5, 2016 at 1:47 AM, James Jones <jajones@xxxxxxxxxx> wrote:
Hello everyone,
As many are aware, we took up the issue of surface/memory allocation at XDC
this year. The outcome of that discussion was the beginnings of a design
proposal for a library that would serve as a cross-device, cross-process
surface allocator. In the past week I've started to condense some of my
notes from that discussion down to code & a design document. I've posted
the first pieces to a github repository here:
https://github.com/cubanismo/allocator
This isn't anything close to usable code yet. Just headers and docs, and
incomplete ones at that. However, feel free to check it out if you're
interested in discussing the design.
Thanks,
-James
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel