Re: [Intel-gfx] [PATCH v2 3/3] drm/i915/uapi: reject set_domain for discrete

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Friday, July 2, 2021 12:22:58 PM PDT Daniel Vetter wrote:
> On Fri, Jul 02, 2021 at 03:31:08PM +0100, Tvrtko Ursulin wrote:
> > 
> > On 01/07/2021 16:10, Matthew Auld wrote:
> > > The CPU domain should be static for discrete, and on DG1 we don't need
> > > any flushing since everything is already coherent, so really all this
> > 
> > Knowledge of the write combine buffer is assumed to be had by anyone involved?
> > 
> > > does is an object wait, for which we have an ioctl. Longer term the
> > > desired caching should be an immutable creation time property for the
> > > BO, which can be set with something like gem_create_ext.
> > > 
> > > One other user is iris + userptr, which uses the set_domain to probe all
> > > the pages to check if the GUP succeeds, however keeping the set_domain
> > > around just for that seems rather scuffed. We could equally just submit
> > > a dummy batch, which should hopefully be good enough, otherwise adding a
> > > new creation time flag for userptr might be an option. Although longer
> > > term we will also have vm_bind, which should also be a nice fit for
> > > this, so adding a whole new flag is likely overkill.
> > 
> > Execbuf sounds horrible. But it all reminds me of past work by Chris which is surprisingly hard to find in the archives. Patches like:
> > 
> > commit 7706a433388016983052a27c0fd74a64b1897ae7
> > Author: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Date:   Wed Nov 8 17:04:07 2017 +0000
> > 
> >     drm/i915/userptr: Probe existence of backing struct pages upon creation
> >     Jason Ekstrand requested a more efficient method than userptr+set-domain
> >     to determine if the userptr object was backed by a complete set of pages
> >     upon creation. To be more efficient than simply populating the userptr
> >     using get_user_pages() (as done by the call to set-domain or execbuf),
> >     we can walk the tree of vm_area_struct and check for gaps or vma not
> >     backed by struct page (VM_PFNMAP). The question is how to handle
> >     VM_MIXEDMAP which may be either struct page or pfn backed...
> > 
> > commit 7ca21d3390eec23db99b8131ed18bc036efaba18
> > Author: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Date:   Wed Nov 8 17:48:22 2017 +0000
> > 
> >     drm/i915/userptr: Add a flag to populate the userptr on creation
> >     Acquiring the backing struct pages for the userptr range is not free;
> >     the first client for userptr would insist on frequently creating userptr
> >     objects ahead of time and not use them. For that first client, deferring
> >     the cost of populating the userptr (calling get_user_pages()) to the
> >     actual execbuf was a substantial improvement. However, not all clients
> >     are the same, and most would like to validate that the userptr is valid
> >     and backed by struct pages upon creation, so offer a
> >     I915_USERPTR_POPULATE flag to do just that.
> >     Note that big difference between I915_USERPTR_POPULATE and the deferred
> >     scheme is that POPULATE is guaranteed to be synchronous, the result is
> >     known before the ioctl returns (and the handle exposed). However, due to
> >     system memory pressure, the object may be paged out before use,
> >     requiring them to be paged back in on execbuf (as may always happen).
> > 
> > At least with the first one I think I was skeptical, since probing at
> > point A makes a weak test versus userptr getting used at point B.
> > Populate is kind of same really when user controls the backing store. At
> > least these two arguments I think stand if we are trying to sell these
> > flags as validation. But if the idea is limited to pure preload, with no
> > guarantees that it keeps working by time of real use, then I guess it
> > may be passable.
> 
> Well we've thrown this out again because there was no userspace. But if
> this is requested by mesa, then the _PROBE flag should be entirely
> sufficient.
> 
> Since I don't want to hold up dg1 pciids on this it'd be nice if we could
> just go ahead with the dummy batch, if Ken/Jordan don't object - iris is
> the only umd that needs this.

I really would rather not have to submit a dummy batchbuffer.

The GL_AMD_pinned_memory extension requires throwing an error when
performing the initial userptr import if "the store cannot be mapped to
the GPU address space".  So this is not a weird thing that iris is
doing, it's part of the actual API we're implementing.

Today, I can use SET_DOMAIN which is almost no code.  In the future,
I'll have VM_BIND, which also makes sense.  In the meantime, this is
taking away my easy implementation for a bunch of code that I have to
keep supporting forever.

>From the point of view of having a clean API...

- Using SET_DOMAIN is clearly a hack.  It works, but we're not intending
  to do anything with cache domains.  Dropping this API is a good plan.

- Passing a flag to USERPTR that says "please actually make sure it
  works" seems entirely reasonable to have as part of the API, and
  matches our userspace API.

- Using VM_BIND would also make sense and seems reasonable.

- Having to construct an entire batch and submit it for actual execution
  on the GPU just to check for an error case seems like awful design IMO.
  Error checking buffers is not what execbuf is for.  And it's not
  simple to use, either.

I checked with Jason and I believe he also prefers having a new flag on
the userptr ioctl.

--Ken

Attachment: signature.asc
Description: This is a digitally signed message part.


[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux