Re: xe vs amdgpu userptr handling

Christian König <christian.koenig@xxxxxxx> · Thu, 8 Feb 2024 12:08:49 +0100

Am 08.02.24 um 10:43 schrieb Thomas Hellström:
On Thu, 2024-02-08 at 10:38 +0100, Thomas Hellström wrote:
Hi,

On Thu, 2024-02-08 at 07:30 +0100, Christian König wrote:
Am 08.02.24 um 01:36 schrieb Dave Airlie:
Just cc'ing some folks. I've also added another question.

On Wed, 7 Feb 2024 at 21:08, Maíra Canal <mcanal@xxxxxxxxxx>
wrote:
Adding another point to this discussion, would it make sense to
somehow
create a generic structure that all drivers, including shmem
drivers,
could use it?

Best Regards,
- Maíra

On 2/7/24 03:56, Dave Airlie wrote:
I'm just looking over the userptr handling in both drivers,
and
of
course they've chosen different ways to represent things.
Again
this
is a divergence that is just going to get more annoying over
time and
eventually I'd like to make hmm/userptr driver independent as
much as
possible, so we get consistent semantics in userspace.

AFAICS the main difference is that amdgpu builds the userptr
handling
inside a GEM object in the kernel, whereas xe doesn't bother
creating
a holding object and just handles things directly in the VM
binding
code.

Is this just different thinking at different times here?
like since we have VM BIND in xe, it made sense not to bother
creating
a gem object for userptrs?
or is there some other advantages to going one way or the
other?

So the current AMD code uses hmm to do userptr work, but xe
doesn't
again, why isn't xe using hmm here, I thought I remembered
scenarios
where plain mmu_notifiers weren't sufficient.
Well amdgpu is using hmm_range_fault because that was made
mandatory
for
the userptr handling.

And yes pure mmu_notifiers are not sufficient, you need the
sequence
number + retry mechanism hmm_range_fault provides.

Otherwise you open up security holes you can push an elephant
through.

Regards,
Christian.
Xe also uses a retry mechanism, so no security hole. The main
difference is we use get_user_pages() + retry instead of
hmm_range_fault() + retry, with a net effect we're probably holding
the
page refcounts a little longer, but we drop it immediately after
obtaining the page pointers, and dirtying the pages.

That said we should move over to hmm_range_fault() to align. I think
this was only a result of limited hmm knowledge when the xe code was
written.

Yeah, that makes sense. In this case it's just a missing cleanup.

As for the userptr not using a backing object in Xe, AFAIK that was
because a backing object was deemed unnecessary with VM_BIND. Joonas
or
Matt can probably provide more detail, but if we're going to do an
hmmptr, and have userptr only be a special case of that, then the
object is ofc out of the question.

Well how do you then store the dma_fence of the operation?

At least my long term plan was to move some of the logic necessary for 
hmm_range_fault based userptr handling into common GEM code.

One puzzle piece of that is the drm_exec object and for that to work 
userptrs would need to be based on GEM objects as well.

FWIW i915 also keeps an object, but it also pins the pages and relies
on the shrinker to release that pinning.

Well what exactly do you pin? The pages backing the userptr?

Cause that would be a no-go as well I think since it badly clashes with 
NUMA migrations and transparent huge pages.

Regards,
Christian.

So yes, some common code would come in handy. From looking at all
implementations I'd

- Use hmm_range_fault() - Probably want to temporarily get and lock
the
pages to dirty them at fault time, though, if gpu mapping is write-
enabled.
- Don't use a backing object - To be able to unify hmmptr and userptr
Oh, and to clarify for people that haven't been following the gpuvm
discussions and the xe SVM discussions,

hmmptr is sparsely populated on-demand allocated (fault aware) and can
do migration.
userptr is allocated upfront and can't do migration.

Idea has been to create helpers for these in drm_gpuvm().

/Thomas

Thanks,
Thomas

Dave.