On 1/18/23 20:48, Christian König wrote:
On 18.01.23 at 20:17, Dave Airlie wrote:
On Thu, 19 Jan 2023 at 02:54, Alex Deucher <alexdeucher@xxxxxxxxx> wrote:
On Wed, Jan 18, 2023 at 11:50 AM Danilo Krummrich <dakr@xxxxxxxxxx> wrote:
On 1/18/23 17:30, Alex Deucher wrote:
On Wed, Jan 18, 2023 at 11:19 AM Danilo Krummrich <dakr@xxxxxxxxxx> wrote:
On 1/18/23 16:37, Christian König wrote:
On 18.01.23 at 16:34, Danilo Krummrich wrote:
Hi Christian,
On 1/18/23 09:53, Christian König wrote:
On 18.01.23 at 07:12, Danilo Krummrich wrote:
This patch series provides a new UAPI for the Nouveau driver in order to support Vulkan features, such as sparse bindings and sparse residency. Furthermore, with the DRM GPUVA manager it provides a new DRM core feature to keep track of GPU virtual address (VA) mappings in a more generic way.
The DRM GPUVA manager is intended to help drivers implement userspace-manageable GPU VA spaces in reference to the Vulkan API. In order to achieve this goal it serves the following purposes in this context.

1) Provide a dedicated range allocator to track GPU VA allocations and mappings, making use of the drm_mm range allocator.
This means that the ranges are allocated by the kernel? If yes, that's a really, really bad idea.
No, it's just for keeping track of the ranges userspace has allocated.
Ok, that makes more sense.
So basically you have an IOCTL which asks the kernel for a free range? Or what exactly is the drm_mm used for here?
Not even that; userspace provides both the base address and the range, the kernel really just keeps track of things. That said, writing a UAPI on top of the GPUVA manager that asks for a free range instead would be possible by just adding the corresponding wrapper functions to get a free hole.

Currently, and that's what I think I read out of your question, the main benefit of using drm_mm over simply stuffing the entries into a list or something boils down to easier collision detection and iterating sub-ranges of the whole VA space.
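Roughly, a minimal sketch of that usage (this is not the actual drm_gpuva_manager code, and everything except the drm_mm_* API is made up here for illustration):

#include <linux/kernel.h>
#include <linux/types.h>
#include <drm/drm_mm.h>

struct example_gpuva {
        struct drm_mm_node node;        /* records start/size of the mapping */
        /* driver data: backing BO, offset, flags, ... */
};

/*
 * Record a mapping at a userspace-chosen address.  drm_mm_reserve_node()
 * returns -ENOSPC if the range collides with an existing node; the node
 * is expected to be zero-initialized by the caller.
 */
static int example_gpuva_insert(struct drm_mm *mm, struct example_gpuva *va,
                                u64 addr, u64 range)
{
        va->node.start = addr;
        va->node.size = range;
        return drm_mm_reserve_node(mm, &va->node);
}

/* Visit all mappings intersecting [addr, addr + range). */
static void example_gpuva_walk(struct drm_mm *mm, u64 addr, u64 range)
{
        struct drm_mm_node *node;

        drm_mm_for_each_node_in_range(node, mm, addr, addr + range) {
                struct example_gpuva *va =
                        container_of(node, struct example_gpuva, node);
                /* split/merge or unmap decisions would happen here */
                (void)va;
        }
}

Handing out a free range instead would then just be a call to drm_mm_insert_node_in_range() rather than reserving a fixed address.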
Why not just do this in userspace? We have a range manager in libdrm_amdgpu that you could lift out into libdrm or some other helper.
The kernel still needs to keep track of the mappings within the various VA spaces, e.g. it silently needs to unmap mappings that are backed by BOs that get evicted and remap them once they're validated (or swapped back in).
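Purely as illustration (none of these structs or helpers exist in Nouveau or in this series; they are hypothetical), the kernel-side requirement boils down to walking every mapping backed by the evicted BO to tear it down, and redoing those mappings once the BO has been validated again:

#include <linux/list.h>
#include <linux/types.h>

struct example_bo;

struct example_mapping {
        struct list_head bo_link;       /* chained off the backing BO */
        u64 addr;
        u64 range;
};

/* hypothetical page table helpers */
static void example_pt_unmap(u64 addr, u64 range) { }
static void example_pt_map(struct example_bo *bo, u64 addr, u64 range) { }

static void example_bo_evicted(struct list_head *mappings)
{
        struct example_mapping *m;

        list_for_each_entry(m, mappings, bo_link)
                example_pt_unmap(m->addr, m->range);
}

static void example_bo_validated(struct example_bo *bo,
                                 struct list_head *mappings)
{
        struct example_mapping *m;

        list_for_each_entry(m, mappings, bo_link)
                example_pt_map(bo, m->addr, m->range);
}

Without that per-BO view of the VA space in the kernel, eviction handling could not be done transparently to userspace.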
Ok, you are just using this for maintaining the GPU VM space in the kernel.
Yes, the idea behind having common code wrapping drm_mm for this is to allow us to make the rules consistent across drivers.

Userspace (generally Vulkan, some compute) has interfaces that pretty much dictate a lot of how VMA tracking works, especially around lifetimes, sparse mappings and splitting/merging the underlying page tables. I'd really like this to be more consistent across drivers, because we've already seen freedreno diverge from amdgpu, and we also have i915/xe to deal with. I'd like to at least have one place where we can say this is how it should work, since this is something that *should* mostly be consistent across drivers, as it is more about how the uapi is exposed.
That's a really good idea, but the implementation with drm_mm won't work like that.

We have Vulkan applications which use the sparse feature to create literally millions of mappings. That's why I have fine-tuned the mapping structure in amdgpu down to ~80 bytes IIRC and try to save every possible CPU cycle in the handling of that.
That's valuable information. Can you recommend such an application for testing / benchmarking?

Your optimization effort sounds great. Might it be worth thinking about generalizing your approach by itself and stacking the drm_gpuva_manager on top of it?
A drm_mm_node is more in the range of ~200 bytes and certainly not suitable for this kind of job.
I strongly suggest using a good chunk of the amdgpu VM code as a blueprint for the common infrastructure instead.
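For illustration only (the struct below is made up and is not amdgpu's actual mapping structure), a mapping node built around a plain interval_tree_node plus a handful of driver fields lands in that ~80-byte ballpark on 64-bit, while struct drm_mm_node alone already carries all the hole-tracking state on top of the range itself:

#include <linux/interval_tree.h>
#include <linux/types.h>

struct slim_mapping {
        struct interval_tree_node it;   /* rb node + start/last: 48 bytes on 64-bit */
        u64 bo_offset;
        u64 flags;
        void *bo;                       /* backing object */
};

/* Track a mapping over [addr, addr + range - 1] in a per-VM interval tree. */
static void slim_mapping_insert(struct rb_root_cached *tree,
                                struct slim_mapping *m, u64 addr, u64 range)
{
        m->it.start = addr;
        m->it.last = addr + range - 1;
        interval_tree_insert(&m->it, tree);
}

Whether the lookups then go through a generic interval tree like this or something driver-specific is a detail; the per-mapping footprint is what matters once there are millions of entries.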
I will definitely have a look.
Regards,
Christian.
Dave.