Re: [PATCH v2 6/7] drm/panfrost: Add support for GPU heap allocations

On 25/07/2019 18:40, Alyssa Rosenzweig wrote:
> Sorry, I was being sloppy again![1] I meant CPU mmapped.

No worries, just wanted to check :)

>> Apparently the blob in some cases creates a SAME_VA GROW_ON_GPF buffer -
>> since SAME_VA means permanently mapped on the CPU this translated to
>> mmapping a HEAP object. Why it does this I've no idea.

> I'm not sure I follow. Conceptually, if you're permanently mapped,
> there's nothing to grow, right? Is there a reason not to just disable
> HEAP in this case, i.e.:
>
> 	if (flags & SAME_VA)
> 		flags &= ~GROW_ON_GPF;
>
> It may not be fully optimal, but that way the legacy code keeps working
> and upstream userspace isn't held back :)

Yes, that's my hack at the moment and it works. It looks like the driver may allocate a depth or stencil buffer without knowing whether it will be used; the buffer is then "grown" if the GPU needs it. The problem is that the application can still access it later.

>> The main use in the blob for
>> this is being able to dump buffers when debugging (i.e. dump buffers
>> before/after every GPU job).

> Could we disable HEAP support in userspace (not setting the flags) for
> debug builds that need to dump buffers? In production the extra memory
> usage matters, hence this patch, but in dev, there's plenty of memory to
> spare.

>> Ideally you also need a way of querying which pages have been backed
>> by faults (much easier with kbase where that's always just the number
>> of pages).

> Is there a use case for this with one of the userland APIs? (Maybe
> Vulkan?)

I'm not aware of any OpenGL(ES) APIs that expose functionality like this. But, e.g., allocating a depth/stencil buffer ahead of time "just in case" would need something like this, because you may need CPU access to it.
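
There's nothing like this in the uapi today, but purely as a sketch of the kind of "how many pages are actually backed" query I mean (the ioctl name, number and struct layout below are invented for illustration - nothing like it exists in panfrost_drm.h):

	/* Purely hypothetical uapi: nothing like this exists in panfrost_drm.h
	 * today. It sketches a "query how many pages of a heap BO have been
	 * backed by faults" interface. */
	#include <stdint.h>
	#include <xf86drm.h>

	struct drm_panfrost_query_bo_backing {
		uint32_t handle;	/* in: GEM handle of the heap BO */
		uint32_t pad;
		uint64_t backed_pages;	/* out: pages populated by GPU page faults */
	};

	/* Invented ioctl number, for illustration only. */
	#define DRM_IOCTL_PANFROST_QUERY_BO_BACKING \
		DRM_IOWR(DRM_COMMAND_BASE + 0x08, struct drm_panfrost_query_bo_backing)

	/* Returns the number of backed pages, or -1 on error. */
	static int64_t query_backed_pages(int fd, uint32_t handle)
	{
		struct drm_panfrost_query_bo_backing q = { .handle = handle };

		if (drmIoctl(fd, DRM_IOCTL_PANFROST_QUERY_BO_BACKING, &q))
			return -1;
		return (int64_t)q.backed_pages;
	}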

Vulkan has the concept of "sparse" bindings/residency. As far as I'm aware there's no requirement that memory is allocated on demand, but a page-by-page approach to populating memory is expected. There's quite a bit of complexity, and the actual way this is represented on the GPU doesn't necessarily match the user-visible API. Also, I believe it's an optional feature.
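
To illustrate the user-visible shape of it (application-side API only, nothing Panfrost-specific), a minimal sketch of binding one page of memory into a sparse buffer; it assumes a queue with sparse binding support, a buffer created with VK_BUFFER_CREATE_SPARSE_BINDING_BIT, and an allocation whose alignment matches the device's sparse granularity:

	/* Sketch: page-by-page population of a sparse buffer. Error handling and
	 * semaphore/fence ordering of the bind are omitted. */
	#include <vulkan/vulkan.h>

	static void bind_one_page(VkQueue queue, VkBuffer buffer,
				  VkDeviceMemory memory, VkDeviceSize page_size,
				  VkDeviceSize page_index)
	{
		/* Back exactly one page-sized range of the buffer with memory. */
		VkSparseMemoryBind bind = {
			.resourceOffset = page_index * page_size,
			.size           = page_size,
			.memory         = memory,
			.memoryOffset   = 0,
		};
		VkSparseBufferMemoryBindInfo buffer_bind = {
			.buffer    = buffer,
			.bindCount = 1,
			.pBinds    = &bind,
		};
		VkBindSparseInfo bind_info = {
			.sType           = VK_STRUCTURE_TYPE_BIND_SPARSE_INFO,
			.bufferBindCount = 1,
			.pBufferBinds    = &buffer_bind,
		};

		/* The bind is queued like other GPU work rather than done immediately. */
		vkQueueBindSparse(queue, 1, &bind_info, VK_NULL_HANDLE);
	}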

Panfrost, of course, doesn't yet have a good mechanism for supporting anything like SAME_VA. My hack so far is to keep allocating BOs until one happens to land at a GPU address that is currently unused in user space.
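
For illustration, that retry loop looks roughly like the sketch below (using the GPU VA returned in drm_panfrost_create_bo.offset and MAP_FIXED_NOREPLACE to insist on the matching CPU address). It's very much a hack: it leaks the rejected BO handles and simply gives up after a few attempts.

	#define _GNU_SOURCE		/* MAP_FIXED_NOREPLACE */
	#include <stdint.h>
	#include <sys/mman.h>
	#include <xf86drm.h>
	#include "panfrost_drm.h"	/* kernel uapi header */

	static void *alloc_same_va(int fd, size_t size)
	{
		for (int attempt = 0; attempt < 16; attempt++) {
			struct drm_panfrost_create_bo create = { .size = size };
			struct drm_panfrost_mmap_bo map = { 0 };
			void *cpu;

			if (drmIoctl(fd, DRM_IOCTL_PANFROST_CREATE_BO, &create))
				return NULL;

			map.handle = create.handle;
			if (drmIoctl(fd, DRM_IOCTL_PANFROST_MMAP_BO, &map))
				return NULL;

			/* create.offset is the GPU VA the kernel picked; try to get
			 * exactly the same CPU address without clobbering anything. */
			cpu = mmap((void *)(uintptr_t)create.offset, size,
				   PROT_READ | PROT_WRITE,
				   MAP_SHARED | MAP_FIXED_NOREPLACE,
				   fd, map.offset);
			if (cpu != MAP_FAILED)
				return cpu;	/* CPU VA == GPU VA */

			/* That address is already taken on the CPU side: leak this
			 * BO and hope the next allocation lands somewhere free. */
		}
		return NULL;
	}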

OpenCL does require something like SAME_VA ("Shared Virtual Memory" or SVM). This is apparently useful because the same pointer can be used on both CPU and GPU.
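
For reference, a minimal sketch of what that looks like from the application side (OpenCL 2.0 coarse-grained SVM; assumes the context, queue and kernel already exist on a device reporting CL_DEVICE_SVM_COARSE_GRAIN_BUFFER, and omits error checking):

	#define CL_TARGET_OPENCL_VERSION 200
	#include <CL/cl.h>

	static int run_with_svm(cl_context ctx, cl_command_queue queue,
				cl_kernel kernel, size_t n)
	{
		/* One allocation, one pointer, visible to both CPU and GPU. */
		float *buf = clSVMAlloc(ctx, CL_MEM_READ_WRITE, n * sizeof(float), 0);
		if (!buf)
			return -1;

		/* CPU writes through the very same pointer the GPU will see. */
		clEnqueueSVMMap(queue, CL_TRUE, CL_MAP_WRITE, buf,
				n * sizeof(float), 0, NULL, NULL);
		for (size_t i = 0; i < n; i++)
			buf[i] = (float)i;
		clEnqueueSVMUnmap(queue, buf, 0, NULL, NULL);

		/* No cl_mem object: the raw pointer is the kernel argument. */
		clSetKernelArgSVMPointer(kernel, 0, buf);
		clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &n, NULL, 0, NULL, NULL);
		clFinish(queue);

		clSVMFree(ctx, buf);
		return 0;
	}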

I can see two approaches for integrating that:

* Use HMM: CPU VA == GPU VA. This nicely solves the problem, but falls over badly when the GPU VA size is smaller than the user space VA size - which is sadly true on many 64-bit integrations.

* Provide an allocation flag which causes the kernel driver to not pick a GPU address until the buffer is mapped on the CPU. The mmap callback would then need to look for a region that is free on both the CPU and GPU.

The second is obviously the most similar to the kbase approach. kbase simplifies things because the kernel driver has the ultimate say over whether the buffer is SAME_VA or not. So on 64-bit user space we upgrade everything to be SAME_VA - which means the GPU VA space just follows the CPU VA (similar to HMM).

Steve
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel



