Re: xe vs amdgpu userptr handling

On Thu, 2024-02-08 at 12:08 +0100, Christian König wrote:
> Am 08.02.24 um 10:43 schrieb Thomas Hellström:
> > On Thu, 2024-02-08 at 10:38 +0100, Thomas Hellström wrote:
> > > Hi,
> > > 
> > > On Thu, 2024-02-08 at 07:30 +0100, Christian König wrote:
> > > > Am 08.02.24 um 01:36 schrieb Dave Airlie:
> > > > > Just cc'ing some folks. I've also added another question.
> > > > > 
> > > > > On Wed, 7 Feb 2024 at 21:08, Maíra Canal <mcanal@xxxxxxxxxx>
> > > > > wrote:
> > > > > > Adding another point to this discussion: would it make
> > > > > > sense to somehow create a generic structure that all
> > > > > > drivers, including shmem drivers, could use?
> > > > > > 
> > > > > > Best Regards,
> > > > > > - Maíra
> > > > > > 
> > > > > > On 2/7/24 03:56, Dave Airlie wrote:
> > > > > > > I'm just looking over the userptr handling in both
> > > > > > > drivers, and of course they've chosen different ways to
> > > > > > > represent things. Again, this is a divergence that is
> > > > > > > just going to get more annoying over time, and eventually
> > > > > > > I'd like to make hmm/userptr handling as
> > > > > > > driver-independent as possible, so we get consistent
> > > > > > > semantics in userspace.
> > > > > > > 
> > > > > > > AFAICS the main difference is that amdgpu builds the
> > > > > > > userptr handling inside a GEM object in the kernel,
> > > > > > > whereas xe doesn't bother creating a holding object and
> > > > > > > just handles things directly in the VM binding code.
> > > > > > > 
> > > > > > > Is this just different thinking at different times here?
> > > > > > > Like, since we have VM BIND in xe, it made sense not to
> > > > > > > bother creating a GEM object for userptrs? Or is there
> > > > > > > some other advantage to going one way or the other?
> > > > > > > 
> > > > > So the current AMD code uses hmm to do userptr work, but xe
> > > > > doesn't. Again, why isn't xe using hmm here? I thought I
> > > > > remembered scenarios where plain mmu_notifiers weren't
> > > > > sufficient.
> > > > Well, amdgpu is using hmm_range_fault() because that was made
> > > > mandatory for the userptr handling.
> > > > 
> > > > And yes, pure mmu_notifiers are not sufficient; you need the
> > > > sequence number + retry mechanism hmm_range_fault() provides.
> > > > 
> > > > Otherwise you open up security holes you can push an elephant
> > > > through.
> > > > 
> > > > Regards,
> > > > Christian.
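
[For readers unfamiliar with the mechanism Christian refers to: the
sequence number + retry pattern around hmm_range_fault() is the one
documented in Documentation/mm/hmm.rst. Below is a minimal,
kernel-style sketch of it, not compilable stand-alone;
driver_lock()/driver_unlock() are placeholders for whatever driver
lock the invalidation callback also takes before zapping GPU page
tables.]

```c
/* Sketch of the documented hmm_range_fault() usage pattern.
 * driver_lock()/driver_unlock() stand in for the driver lock that
 * the mmu_interval_notifier invalidation callback takes before it
 * zaps GPU page tables.
 */
int sketch_populate_range(struct mmu_interval_notifier *ni,
			  struct hmm_range *range)
{
	int ret;

again:
	/* Snapshot the invalidation sequence number. */
	range->notifier_seq = mmu_interval_read_begin(ni);

	mmap_read_lock(ni->mm);
	ret = hmm_range_fault(range);
	mmap_read_unlock(ni->mm);
	if (ret) {
		if (ret == -EBUSY)
			goto again; /* Invalidation raced with the fault. */
		return ret;
	}

	driver_lock();
	/* If an invalidation ran since mmu_interval_read_begin(),
	 * the snapshot is stale and we must retry.
	 */
	if (mmu_interval_read_retry(ni, range->notifier_seq)) {
		driver_unlock();
		goto again;
	}

	/* Safe to program GPU page tables from range->hmm_pfns here. */
	driver_unlock();
	return 0;
}
```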
> > > Xe also uses a retry mechanism, so there is no security hole.
> > > The main difference is that we use get_user_pages() + retry
> > > instead of hmm_range_fault() + retry, with the net effect that
> > > we're probably holding the page refcounts a little longer; we
> > > drop them immediately after obtaining the page pointers and
> > > dirtying the pages.
> > > 
> > > That said, we should move over to hmm_range_fault() to align. I
> > > think this was only a result of limited hmm knowledge when the
> > > xe code was written.
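
[The get_user_pages()-based variant Thomas describes would look
roughly like the sketch below. Illustrative only; struct vm_userptr
is a hypothetical container, the real xe code differs in detail, and
partial-pin error handling is elided. Correctness still relies on the
same notifier sequence check as above; the difference is only how the
pages are looked up and that the references are dropped right after
the page pointers are recorded.]

```c
/* Illustrative sketch: look the pages up with GUP, record the page
 * pointers, dirty the pages if writably mapped, then drop the
 * references immediately. struct vm_userptr is hypothetical.
 */
static int sketch_gup_populate(struct vm_userptr *uptr)
{
	long npages, i;

	npages = get_user_pages_fast(uptr->start, uptr->npages,
				     uptr->write ? FOLL_WRITE : 0,
				     uptr->pages);
	if (npages != uptr->npages)
		return -EFAULT; /* Partial-pin cleanup elided. */

	for (i = 0; i < npages; i++) {
		/* ... record pfns / dma-map the page here ... */
		if (uptr->write)
			set_page_dirty_lock(uptr->pages[i]);
		put_page(uptr->pages[i]);
	}
	return 0;
}
```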
> 
> Yeah, that makes sense. In this case it's just a missing cleanup.
> 
> > > As for the userptr not using a backing object in Xe, AFAIK that
> > > was because a backing object was deemed unnecessary with
> > > VM_BIND. Joonas or Matt can probably provide more detail, but if
> > > we're going to do an hmmptr, and have userptr only be a special
> > > case of that, then the object is of course out of the question.
> 
> Well how do you then store the dma_fence of the operation?
> 
> At least my long-term plan was to move some of the logic necessary
> for hmm_range_fault()-based userptr handling into common GEM code.

Since these are never shared, the dma-fences resulting from both the
bind operations and the workloads using the user-pointers are stored
in the vm's dma-resv, just like for local GEM objects. This is IMHO a
bit sub-optimal, though, since ideally I would want the dma-fences
resulting from the bind operations stored per vma: when zapping PTEs
in a faulting VM we first need to wait for the bind to complete, but
only that; we don't need to wait for the whole VM to become idle.
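
[The bookkeeping described above amounts to something like the
following sketch; the actual usage flag and locking context in xe may
differ, and sketch_store_bind_fence() is a hypothetical name.]

```c
/* Sketch: store the fence from a bind operation in the VM's shared
 * reservation object instead of on a per-userptr GEM object. The
 * choice of DMA_RESV_USAGE_KERNEL is illustrative.
 */
static int sketch_store_bind_fence(struct dma_resv *vm_resv,
				   struct dma_fence *bind_fence)
{
	int ret;

	dma_resv_lock(vm_resv, NULL);
	ret = dma_resv_reserve_fences(vm_resv, 1);
	if (!ret)
		dma_resv_add_fence(vm_resv, bind_fence,
				   DMA_RESV_USAGE_KERNEL);
	dma_resv_unlock(vm_resv);
	return ret;
}
```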

> 
> One puzzle piece of that is the drm_exec object, and for that to
> work userptrs would need to be based on GEM objects as well.

It does with the above solution. The vm's dma-resv is aliased to a
suitable GEM object.

> 
> > > FWIW i915 also keeps an object, but it also pins the pages and
> > > relies
> > > on the shrinker to release that pinning.
> 
> Well, what exactly do you pin? The pages backing the userptr?
> 
> Because that would be a no-go as well, I think, since it badly
> clashes with NUMA migration and transparent huge pages.

Exactly. I guess just nobody dared to remove the pinning in i915 in
favor of the pure mmu_notifier protection to see what errors CI might
come up with. Yet.

/Thomas


> 
> Regards,
> Christian.
> 
> > > 
> > > So yes, some common code would come in handy. From looking at
> > > all the implementations, I'd:
> > > 
> > > - Use hmm_range_fault(). We probably want to temporarily get and
> > >   lock the pages to dirty them at fault time, though, if the gpu
> > >   mapping is write-enabled.
> > > - Not use a backing object, to be able to unify hmmptr and
> > >   userptr.
> > Oh, and to clarify for people that haven't been following the
> > gpuvm discussions and the xe SVM discussions:
> > 
> > hmmptr is sparsely populated, on-demand allocated (fault-aware),
> > and can do migration.
> > userptr is allocated upfront and can't do migration.
> > 
> > The idea has been to create helpers for these in drm_gpuvm().
> > 
> > /Thomas
> > 
> > 
> > 
> > > Thanks,
> > > Thomas
> > > 
> > > > > Dave.
> 




