On Wed, Apr 24, 2024 at 04:35:17PM +0000, Matthew Brost wrote:
> On Wed, Apr 24, 2024 at 10:57:54AM -0300, Jason Gunthorpe wrote:
> > On Wed, Apr 24, 2024 at 02:31:36AM +0000, Matthew Brost wrote:
> >
> > > AMD seems to register notifiers on demand for parts of the address space
> > > [1], I think Nvidia's open source driver does this too (can look this up
> > > if needed). We (Intel) also do this in Xe and the i915 for userptrs
> > > (explicitly binding a user address via IOCTL) too and it seems to work
> > > quite well.
> >
> > I always thought AMD's implementation of this stuff was bad..
>
> No comment on the quality of AMD's implementation.
>
> But in general the view among my team members is that registering
> notifiers on demand for sub ranges is an accepted practice.

Yes, but not at a 2M granularity, and not without sparsity. Do it on
something like an aligned 512M and it would be fairly reasonable (see
the first sketch at the end of this mail).

> You do not *need* some other data structure as you could always just
> walk the page tables, but in practice a data structure exists in a
> tree of sorts with the key being a VA range. The data structure has
> metadata about the mapping; all GPU drivers seem to have this.

What "metadata" is there for an SVA mapping? The entire page table is
an SVA.

> structure, along with pages returned from hmm_range_fault, is used to
> program the GPU PTEs.

Most likely pages returned from hmm_range_fault() can just be stored
directly in the page table's PTEs. I'd be surprised if you actually
need separate storage (see the second sketch below). (ignoring some of
the current issues with the DMA API)

> Again the allocation of this data structure happens *before* calling
> hmm_range_fault on the first GPU fault within an unmapped range.

The SVA page table and hmm_range_fault are tightly connected together;
if a vma is needed to make it work then it is not "before", it is part
of it.

Jason
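
For illustration, a minimal sketch of the coarse-grained registration
meant above, built on the mmu_interval_notifier API. GPU_CHUNK_SIZE,
struct gpu_chunk, gpu_chunk_invalidate() and the zap step are
hypothetical driver-side names; mmu_interval_notifier_insert(),
mmu_interval_set_seq() and the ops structure are the real kernel
interface:

  #include <linux/mmu_notifier.h>
  #include <linux/mutex.h>
  #include <linux/sizes.h>

  #define GPU_CHUNK_SIZE SZ_512M  /* hypothetical granularity */

  /* Hypothetical per-chunk driver tracking structure */
  struct gpu_chunk {
          struct mmu_interval_notifier notifier;
          struct mutex lock;      /* serializes faults vs. invalidation */
  };

  static bool gpu_chunk_invalidate(struct mmu_interval_notifier *mni,
                                   const struct mmu_notifier_range *range,
                                   unsigned long cur_seq)
  {
          struct gpu_chunk *chunk =
                  container_of(mni, struct gpu_chunk, notifier);

          if (!mmu_notifier_range_blockable(range))
                  return false;

          mutex_lock(&chunk->lock);
          /* Make concurrent fault handlers retry */
          mmu_interval_set_seq(mni, cur_seq);
          /* ... zap the overlapping GPU PTEs here ... */
          mutex_unlock(&chunk->lock);
          return true;
  }

  static const struct mmu_interval_notifier_ops gpu_chunk_ops = {
          .invalidate = gpu_chunk_invalidate,
  };

  /*
   * On the first GPU fault in an unregistered area, cover the whole
   * aligned 512M chunk instead of just the faulting 2M range.
   */
  static int gpu_register_chunk(struct gpu_chunk *chunk,
                                struct mm_struct *mm, unsigned long addr)
  {
          return mmu_interval_notifier_insert(&chunk->notifier, mm,
                                  ALIGN_DOWN(addr, GPU_CHUNK_SIZE),
                                  GPU_CHUNK_SIZE, &gpu_chunk_ops);
  }

One wide notifier like this absorbs many faults, and the chunk stays
sparse because only the faulted PTEs are ever populated.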
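
And a sketch of the second point, that hmm_range_fault() output can be
written straight into the GPU page table with no intermediate storage.
This follows the retry pattern documented in Documentation/mm/hmm.rst;
gpu_write_pte() and the gpu_chunk bits are again hypothetical:

  #include <linux/hmm.h>
  #include <linux/mm.h>

  /* pfns is caller-provided, one entry per PAGE_SIZE page in [start, end) */
  static int gpu_fault_range(struct gpu_chunk *chunk, struct mm_struct *mm,
                             unsigned long start, unsigned long end,
                             unsigned long *pfns)
  {
          struct hmm_range range = {
                  .notifier = &chunk->notifier,
                  .start = start,
                  .end = end,
                  .hmm_pfns = pfns,
                  .default_flags = HMM_PFN_REQ_FAULT | HMM_PFN_REQ_WRITE,
          };
          unsigned long i, npages = (end - start) >> PAGE_SHIFT;
          int ret;

  again:
          range.notifier_seq = mmu_interval_read_begin(range.notifier);
          mmap_read_lock(mm);
          ret = hmm_range_fault(&range);
          mmap_read_unlock(mm);
          if (ret) {
                  if (ret == -EBUSY)
                          goto again;     /* raced with an invalidation */
                  return ret;
          }

          mutex_lock(&chunk->lock);
          if (mmu_interval_read_retry(range.notifier, range.notifier_seq)) {
                  mutex_unlock(&chunk->lock);
                  goto again;
          }

          /*
           * The faulted PFNs go straight into the GPU page table; no
           * separate page array outlives this function.  gpu_write_pte()
           * is a stand-in for the driver's PTE writer.
           */
          for (i = 0; i < npages; i++)
                  gpu_write_pte(start + (i << PAGE_SHIFT),
                                hmm_pfn_to_page(pfns[i]));
          mutex_unlock(&chunk->lock);
          return 0;
  }

Once the PTEs are written under chunk->lock the pfns array can be
thrown away; the GPU page table itself is the only record, which is
the point above.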