On Mon, Jun 13, 2022 at 07:09:06PM +0100, Tvrtko Ursulin wrote:
>
> On 13/06/2022 18:49, Niranjana Vishwanathapura wrote:
> > On Mon, Jun 13, 2022 at 05:22:02PM +0100, Tvrtko Ursulin wrote:
> > >
> > > On 13/06/2022 16:05, Niranjana Vishwanathapura wrote:
> > > > On Mon, Jun 13, 2022 at 09:24:18AM +0100, Tvrtko Ursulin wrote:
> > > > >
> > > > > On 10/06/2022 17:14, Niranjana Vishwanathapura wrote:
> > > > > > On Fri, Jun 10, 2022 at 05:48:39PM +0300, Lionel Landwerlin wrote:
> > > > > > > On 10/06/2022 13:37, Tvrtko Ursulin wrote:
> > > > > > > >
> > > > > > > > On 10/06/2022 08:07, Niranjana Vishwanathapura wrote:
> > > > > > > > > VM_BIND and related uapi definitions
> > > > > > > > >
> > > > > > > > > Signed-off-by: Niranjana Vishwanathapura
> > > > > > > > > <niranjana.vishwanathapura@xxxxxxxxx>
> > > > > > > > > ---
> > > > > > > > >  Documentation/gpu/rfc/i915_vm_bind.h | 490 +++++++++++++++++++++++++++
> > > > > > > > >  1 file changed, 490 insertions(+)
> > > > > > > > >  create mode 100644 Documentation/gpu/rfc/i915_vm_bind.h
> > > > > > > > >
> > > > > > > > > diff --git a/Documentation/gpu/rfc/i915_vm_bind.h b/Documentation/gpu/rfc/i915_vm_bind.h
> > > > > > > > > new file mode 100644
> > > > > > > > > index 000000000000..9fc854969cfb
> > > > > > > > > --- /dev/null
> > > > > > > > > +++ b/Documentation/gpu/rfc/i915_vm_bind.h
> > > > > > > > > @@ -0,0 +1,490 @@
> > > > > > > > > +/* SPDX-License-Identifier: MIT */
> > > > > > > > > +/*
> > > > > > > > > + * Copyright © 2022 Intel Corporation
> > > > > > > > > + */
> > > > > > > > > +
> > > > > > > > > +/**
> > > > > > > > > + * DOC: I915_PARAM_HAS_VM_BIND
> > > > > > > > > + *
> > > > > > > > > + * VM_BIND feature availability.
> > > > > > > > > + * See typedef drm_i915_getparam_t param.
> > > > > > > > > + * bit[0]: If set, VM_BIND is supported, otherwise not.
> > > > > > > > > + * bits[8-15]: VM_BIND implementation version.
> > > > > > > > > + * Version 0 will not have VM_BIND/UNBIND timeline fence array support.
> > > > > > > > > + */
> > > > > > > > > +#define I915_PARAM_HAS_VM_BIND		57
> > > > > > > > > +
> > > > > > > > > +/**
> > > > > > > > > + * DOC: I915_VM_CREATE_FLAGS_USE_VM_BIND
> > > > > > > > > + *
> > > > > > > > > + * Flag to opt-in for VM_BIND mode of binding during VM creation.
> > > > > > > > > + * See struct drm_i915_gem_vm_control flags.
> > > > > > > > > + *
> > > > > > > > > + * The older execbuf2 ioctl will not support VM_BIND mode of operation.
> > > > > > > > > + * For VM_BIND mode, we have a new execbuf3 ioctl which will not accept
> > > > > > > > > + * any execlist (see struct drm_i915_gem_execbuffer3 for more details).
> > > > > > > > > + */
> > > > > > > > > +#define I915_VM_CREATE_FLAGS_USE_VM_BIND	(1 << 0)
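As an aside for readers, the availability check this implies would look
something like the sketch below from userspace. Only I915_PARAM_HAS_VM_BIND
is from this RFC; the getparam ioctl and struct are existing uapi.

#include <sys/ioctl.h>
#include <drm/i915_drm.h>	/* with this RFC's definitions applied */

/* Returns the VM_BIND implementation version, or -1 if unsupported. */
static int i915_vm_bind_version(int fd)
{
	int value = 0;
	struct drm_i915_getparam gp = {
		.param = I915_PARAM_HAS_VM_BIND,
		.value = &value,
	};

	if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp))
		return -1;		/* kernel does not know the param */
	if (!(value & 0x1))
		return -1;		/* bit[0] clear: no VM_BIND */
	return (value >> 8) & 0xff;	/* bits[8-15]: implementation version */
}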
> > > > > > > > > +
> > > > > > > > > +/**
> > > > > > > > > + * DOC: I915_CONTEXT_CREATE_FLAGS_LONG_RUNNING
> > > > > > > > > + *
> > > > > > > > > + * Flag to declare a context as long running.
> > > > > > > > > + * See struct drm_i915_gem_context_create_ext flags.
> > > > > > > > > + *
> > > > > > > > > + * Usage of dma-fence expects that they complete in a reasonable amount
> > > > > > > > > + * of time. Compute on the other hand can be long running. Hence it is
> > > > > > > > > + * not appropriate for compute contexts to export request completion
> > > > > > > > > + * dma-fence to user. The dma-fence usage will be limited to in-kernel
> > > > > > > > > + * consumption only. Compute contexts need to use user/memory fence.
> > > > > > > > > + *
> > > > > > > > > + * So, long running contexts do not support output fences. Hence,
> > > > > > > > > + * I915_EXEC_FENCE_SIGNAL (see &drm_i915_gem_exec_fence.flags) is
> > > > > > > > > + * expected to not be used. The DRM_I915_GEM_WAIT ioctl call is also
> > > > > > > > > + * not supported for objects mapped to long running contexts.
> > > > > > > > > + */
> > > > > > > > > +#define I915_CONTEXT_CREATE_FLAGS_LONG_RUNNING	(1u << 2)
> > > > > > > > > +
> > > > > > > > > +/* VM_BIND related ioctls */
> > > > > > > > > +#define DRM_I915_GEM_VM_BIND		0x3d
> > > > > > > > > +#define DRM_I915_GEM_VM_UNBIND		0x3e
> > > > > > > > > +#define DRM_I915_GEM_EXECBUFFER3	0x3f
> > > > > > > > > +#define DRM_I915_GEM_WAIT_USER_FENCE	0x40
> > > > > > > > > +
> > > > > > > > > +#define DRM_IOCTL_I915_GEM_VM_BIND		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_BIND, struct drm_i915_gem_vm_bind)
> > > > > > > > > +#define DRM_IOCTL_I915_GEM_VM_UNBIND		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_UNBIND, struct drm_i915_gem_vm_bind)
> > > > > > > > > +#define DRM_IOCTL_I915_GEM_EXECBUFFER3		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER3, struct drm_i915_gem_execbuffer3)
> > > > > > > > > +#define DRM_IOCTL_I915_GEM_WAIT_USER_FENCE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_WAIT_USER_FENCE, struct drm_i915_gem_wait_user_fence)
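Putting the two opt-ins together, setup from userspace would look roughly
like this. Again a sketch: the USE_VM_BIND and LONG_RUNNING flags are the
proposals above, while the ioctls and structs are existing uapi; attaching
the VM to the context (via I915_CONTEXT_PARAM_VM) is elided.

static int create_vm_bind_vm(int fd, __u32 *vm_id)
{
	struct drm_i915_gem_vm_control ctl = {
		.flags = I915_VM_CREATE_FLAGS_USE_VM_BIND, /* opt in to VM_BIND mode */
	};

	if (ioctl(fd, DRM_IOCTL_I915_GEM_VM_CREATE, &ctl))
		return -1;
	*vm_id = ctl.vm_id;
	return 0;
}

static int create_long_running_ctx(int fd, __u32 *ctx_id)
{
	struct drm_i915_gem_context_create_ext arg = {
		.flags = I915_CONTEXT_CREATE_FLAGS_LONG_RUNNING, /* no output fences */
	};

	if (ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, &arg))
		return -1;
	*ctx_id = arg.ctx_id;
	return 0;
}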
> > > > > > > > > +
> > > > > > > > > +/**
> > > > > > > > > + * struct drm_i915_gem_vm_bind - VA to object mapping to bind.
> > > > > > > > > + *
> > > > > > > > > + * This structure is passed to the VM_BIND ioctl and specifies the
> > > > > > > > > + * mapping of a GPU virtual address (VA) range to the section of an
> > > > > > > > > + * object that should be bound in the device page table of the
> > > > > > > > > + * specified address space (VM). The VA range specified must be unique
> > > > > > > > > + * (i.e., not currently bound) and can be mapped to the whole object
> > > > > > > > > + * or a section of the object (partial binding). Multiple VA mappings
> > > > > > > > > + * can be created to the same section of the object (aliasing).
> > > > > > > > > + *
> > > > > > > > > + * The @queue_idx specifies the queue to use for binding. The same
> > > > > > > > > + * queue can be used for both VM_BIND and VM_UNBIND calls. All
> > > > > > > > > + * submitted bind and unbind operations in a queue are performed in
> > > > > > > > > + * the order of submission.
> > > > > > > > > + *
> > > > > > > > > + * The @start, @offset and @length should be 4K page aligned. However,
> > > > > > > > > + * DG2 and XEHPSDV have a 64K page size for device local-memory and
> > > > > > > > > + * have a compact page table. On those platforms, for binding device
> > > > > > > > > + * local-memory objects, @start should be 2M aligned, and @offset and
> > > > > > > > > + * @length should be 64K aligned. Also, on those platforms, it is not
> > > > > > > > > + * allowed to bind a device local-memory object and a system memory
> > > > > > > > > + * object in a single 2M section of the VA range.
> > > > > > > > > + */
> > > > > > > > > +struct drm_i915_gem_vm_bind {
> > > > > > > > > +	/** @vm_id: VM (address space) id to bind */
> > > > > > > > > +	__u32 vm_id;
> > > > > > > > > +
> > > > > > > > > +	/** @queue_idx: Index of queue for binding */
> > > > > > > > > +	__u32 queue_idx;
> > > > > > > >
> > > > > > > > I have a question here to which I did not find an answer by
> > > > > > > > browsing the old threads.
> > > > > > > >
> > > > > > > > Queue index appears to be an implicit synchronisation mechanism,
> > > > > > > > right? Operations on the same index are executed/complete in
> > > > > > > > order of ioctl submission?
> > > > > > > >
> > > > > > > > Do we _have_ to implement this on the kernel side, and could we
> > > > > > > > just allow in/out fences and let userspace deal with it?
> > > > > > >
> > > > > > > It orders operations like in a queue. Which is kind of what
> > > > > > > happens with existing queues/engines.
> > > > > > >
> > > > > > > If I understood correctly, it's going to be a kthread + a linked
> > > > > > > list, right?
> > > > > >
> > > > > > Yes, that is correct.
> > > > > >
> > > > > > >
> > > > > > > -Lionel
> > > > > > >
> > > > > > > >
> > > > > > > > An arbitrary/on-demand number of queues will add complexity on
> > > > > > > > the kernel side which should be avoided if possible.
> > > > > >
> > > > > > It was discussed in the other thread. Jason prefers this over
> > > > > > putting an artificial limit on the number of queues (as the user
> > > > > > can anyway exhaust the memory). I think the complexity in the
> > > > > > driver is manageable.
> > > > >
> > > > > You'll need to create tracking structures on demand, with atomic
> > > > > replace of last fence, ref counting and locking of some sort, more
> > > > > or less?
> > > >
> > > > We will have a workqueue, a work item and a linked list per queue.
> > > > A VM_BIND/UNBIND call will add the mapping request to the specified
> > > > queue's linked list and schedule the work item on the workqueue of
> > > > that queue. I am not sure what you mean by last fence and replacing
> > > > it.
> > > >
> > > > > > The other option being discussed is to have the user create those
> > > > > > queues (like creating an engine map) beforehand and use that in
> > > > > > the vm_bind and vm_unbind ioctls. This puts a limit on the number
> > > > > > of queues. But it is not clean either, and I am not sure it is
> > > > > > worth making the interface more complex.
> > > > > > https://www.spinics.net/lists/dri-devel/msg350448.html
> > > > >
> > > > > What about the third option of a flag to return a fence (of some
> > > > > sort) and pass in a fence? That way userspace can imagine zero or N
> > > > > queues with very little effort on the kernel side. Was this
> > > > > considered?
> > > >
> > > > I am not clear what fence you are talking about here and how that
> > > > helps with the number of vm_bind queues. Can you elaborate?
> > >
> > > It is actually already documented that bind/unbind will support input
> > > and output fences - so what are these queues on top of what userspace
> > > can already achieve by using them? Purely a convenience or there is
> > > more to it?
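For reference, the per-queue bookkeeping Niranjana describes above ("a
workqueue, a work item and a linked list per queue") would look roughly
like the following on the kernel side. This is a hypothetical sketch with
made-up names, not actual i915 code.

#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/workqueue.h>

struct i915_vm_bind_queue {
	spinlock_t lock;		/* protects @requests */
	struct list_head requests;	/* pending bind/unbind requests, FIFO */
	struct work_struct work;	/* drains @requests in submission order */
};

/* ioctl path: append the request to the chosen queue and kick its worker.
 * Ordering within a queue falls out of the FIFO list; separate queues
 * make progress independently. */
static void vm_bind_queue_add(struct i915_vm_bind_queue *q,
			      struct list_head *request_link,
			      struct workqueue_struct *wq)
{
	spin_lock(&q->lock);
	list_add_tail(request_link, &q->requests);
	spin_unlock(&q->lock);

	queue_work(wq, &q->work);
}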
> > >
> >
> > Oh, the vm_bind queues are discussed in this thread.
> > https://lists.freedesktop.org/archives/intel-gfx/2022-June/299217.html
> >
> > Apparently Vulkan has a requirement for multiple queues, with each
> > queue processing vm_bind/unbind calls in the order of submission.
>
> I don't see how that answers my question so I will take the freedom to
> repeat it. What are these queues on top of what userspace can already
> achieve by using in-out fences? Purely a convenience or there is more
> to it?
>
> Queue1:
>
> out_fence_A = vm_bind A
> out_fence_B = vm_bind B, in_fence=out_fence_A
> execbuf(in_fence = out_fence_B)
>
> Queue2:
>
> out_fence_C = vm_bind C
> out_fence_D = vm_bind D, in_fence=out_fence_C
> execbuf(in_fence = out_fence_D)
>
> Parallel bind:
>
> out_fence_E = vm_bind E
> out_fence_F = vm_bind F
> merged_fence = fence_merge(out_fence_E, out_fence_F)
> execbuf(in_fence = merged_fence)

Let's say you do this with only 1 queue:

VM_BIND_A (in_fence=fence_A)
VM_BIND_B (in_fence=NULL)

With 1 queue, VM_BIND_B is blocked on fence_A, hence the need for more
than 1 queue. e.g.

VM_BIND_A (queue_id=0, in_fence=fence_A)
VM_BIND_B (queue_id=1, in_fence=NULL)

Now VM_BIND_B can be executed immediately regardless of fence_A's
status.

Matt

> Regards,
>
> Tvrtko
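For completeness, Matt's two-queue example expressed against the proposed
ioctl would be something like the sketch below. Only @vm_id and @queue_idx
appear in the struct snippet above; @start/@offset/@length are named in the
doc comment, and the object handle and in-fence plumbing are assumptions
about the rest of the RFC.

static void bind_on_two_queues(int fd, __u32 vm_id)
{
	/* Queue 0: this bind carries an in-fence (attachment elided here),
	 * so queue 0 stalls until fence_A signals. */
	struct drm_i915_gem_vm_bind bind_a = {
		.vm_id = vm_id,
		.queue_idx = 0,
		/* .handle, .start, .offset, .length, in-fence: assumed fields */
	};

	/* Queue 1: no in-fence, so this bind need not wait behind bind_a. */
	struct drm_i915_gem_vm_bind bind_b = {
		.vm_id = vm_id,
		.queue_idx = 1,
		/* .handle, .start, .offset, .length: assumed fields */
	};

	ioctl(fd, DRM_IOCTL_I915_GEM_VM_BIND, &bind_a);
	ioctl(fd, DRM_IOCTL_I915_GEM_VM_BIND, &bind_b);
}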