On Thu, 2023-09-14 at 19:25 +0200, Danilo Krummrich wrote: > On 9/14/23 19:21, Thomas Hellström wrote: > > On Thu, 2023-09-14 at 18:36 +0200, Danilo Krummrich wrote: > > > On 9/14/23 15:48, Thomas Hellström wrote: > > > > Hi, Danilo > > > > > > > > Some additional minor comments as xe conversion progresses. > > > > > > > > On 9/9/23 17:31, Danilo Krummrich wrote: > > > > > So far the DRM GPUVA manager offers common infrastructure to > > > > > track GPU VA > > > > > allocations and mappings, generically connect GPU VA mappings > > > > > to > > > > > their > > > > > backing buffers and perform more complex mapping operations > > > > > on > > > > > the GPU VA > > > > > space. > > > > > > > > > > However, there are more design patterns commonly used by > > > > > drivers, > > > > > which > > > > > can potentially be generalized in order to make the DRM GPUVA > > > > > manager > > > > > represent a basic GPU-VM implementation. In this context, > > > > > this > > > > > patch aims > > > > > at generalizing the following elements. > > > > > > > > > > 1) Provide a common dma-resv for GEM objects not being used > > > > > outside of > > > > > this GPU-VM. > > > > > > > > > > 2) Provide tracking of external GEM objects (GEM objects > > > > > which > > > > > are > > > > > shared with other GPU-VMs). > > > > > > > > > > 3) Provide functions to efficiently lock all GEM objects dma- > > > > > resv > > > > > the > > > > > GPU-VM contains mappings of. > > > > > > > > > > 4) Provide tracking of evicted GEM objects the GPU-VM > > > > > contains > > > > > mappings > > > > > of, such that validation of evicted GEM objects is > > > > > accelerated. > > > > > > > > > > 5) Provide some convinience functions for common patterns. > > > > > > > > > > Rather than being designed as a "framework", the target is to > > > > > make all > > > > > features appear as a collection of optional helper functions, > > > > > such that > > > > > drivers are free to make use of the DRM GPUVA managers basic > > > > > functionality and opt-in for other features without setting > > > > > any > > > > > feature > > > > > flags, just by making use of the corresponding functions. > > > > > > > > > > Big kudos to Boris Brezillon for his help to figure out > > > > > locking > > > > > for drivers > > > > > updating the GPU VA space within the fence signalling path. > > > > > > > > > > Suggested-by: Matthew Brost <matthew.brost@xxxxxxxxx> > > > > > Signed-off-by: Danilo Krummrich <dakr@xxxxxxxxxx> > > > > > --- > > > > > > > > > > +/** > > > > > + * drm_gpuvm_bo_evict() - add / remove a &drm_gem_object to > > > > > / > > > > > from a > > > > > + * &drm_gpuvms evicted list > > > > > + * @obj: the &drm_gem_object to add or remove > > > > > + * @evict: indicates whether the object is evicted > > > > > + * > > > > > + * Adds a &drm_gem_object to or removes it from all > > > > > &drm_gpuvms > > > > > evicted > > > > > + * list containing a mapping of this &drm_gem_object. > > > > > + */ > > > > > +void > > > > > +drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool evict) > > > > > +{ > > > > > + struct drm_gpuvm_bo *vm_bo; > > > > > + > > > > > + drm_gem_for_each_gpuvm_bo(vm_bo, obj) { > > > > > + if (evict) > > > > > + drm_gpuvm_bo_list_add(vm_bo, evict); > > > > > + else > > > > > + drm_gpuvm_bo_list_del(vm_bo, evict); > > > > > + } > > > > > +} > > > > > +EXPORT_SYMBOL_GPL(drm_gpuvm_bo_evict); > > > > > + > > > > > > > > We need a drm_gpuvm_bo_evict(struct drm_gpuvm_bo *vm_bo, ...) > > > > that > > > > puts a single gpuvm_bo on the list, the above function could > > > > perhaps be renamed as drm_gpuvm_gem_obj_evict(obj, ....). > > > > > > Makes sense - gonna change that. > > > > > > > > > > > Reason is some vm's are faulting vms which don't have an evict > > > > list, but validate from the pagefault handler. Also evict == > > > > false > > > > is dangerous because if called from within an exec, it might > > > > remove > > > > the obj from other vm's evict list before they've had a chance > > > > to > > > > rebind their VMAs. > > > > > > > > > static int > > > > > __drm_gpuva_insert(struct drm_gpuvm *gpuvm, > > > > > struct drm_gpuva *va) > > > > > diff --git a/include/drm/drm_gpuvm.h > > > > > b/include/drm/drm_gpuvm.h > > > > > index afa50b9059a2..834bb6d6617e 100644 > > > > > --- a/include/drm/drm_gpuvm.h > > > > > +++ b/include/drm/drm_gpuvm.h > > > > > @@ -26,10 +26,12 @@ > > > > > */ > > > > > #include <linux/list.h> > > > > > +#include <linux/dma-resv.h> > > > > > #include <linux/rbtree.h> > > > > > #include <linux/types.h> > > > > > #include <drm/drm_gem.h> > > > > > +#include <drm/drm_exec.h> > > > > > struct drm_gpuvm; > > > > > struct drm_gpuvm_bo; > > > > > @@ -259,6 +261,38 @@ struct drm_gpuvm { > > > > > * space > > > > > */ > > > > > struct dma_resv *resv; > > > > > + > > > > > + /** > > > > > + * @extobj: structure holding the extobj list > > > > > + */ > > > > > + struct { > > > > > + /** > > > > > + * @list: &list_head storing &drm_gpuvm_bos serving > > > > > as > > > > > + * external object > > > > > + */ > > > > > + struct list_head list; > > > > > + > > > > > + /** > > > > > + * @lock: spinlock to protect the extobj list > > > > > + */ > > > > > + spinlock_t lock; > > > > > + } extobj; > > > > > + > > > > > + /** > > > > > + * @evict: structure holding the evict list and evict > > > > > list > > > > > lock > > > > > + */ > > > > > + struct { > > > > > + /** > > > > > + * @list: &list_head storing &drm_gpuvm_bos > > > > > currently > > > > > being > > > > > + * evicted > > > > > + */ > > > > > + struct list_head list; > > > > > + > > > > > + /** > > > > > + * @lock: spinlock to protect the evict list > > > > > + */ > > > > > + spinlock_t lock; > > > > > + } evict; > > > > > }; > > > > > void drm_gpuvm_init(struct drm_gpuvm *gpuvm, struct > > > > > drm_device > > > > > *drm, > > > > > @@ -268,6 +302,21 @@ void drm_gpuvm_init(struct drm_gpuvm > > > > > *gpuvm, > > > > > struct drm_device *drm, > > > > > const struct drm_gpuvm_ops *ops); > > > > > void drm_gpuvm_destroy(struct drm_gpuvm *gpuvm); > > > > > +/** > > > > > + * drm_gpuvm_is_extobj() - indicates whether the given > > > > > &drm_gem_object is an > > > > > + * external object > > > > > + * @gpuvm: the &drm_gpuvm to check > > > > > + * @obj: the &drm_gem_object to check > > > > > + * > > > > > + * Returns: true if the &drm_gem_object &dma_resv differs > > > > > from > > > > > the > > > > > + * &drm_gpuvms &dma_resv, false otherwise > > > > > + */ > > > > > +static inline bool drm_gpuvm_is_extobj(struct drm_gpuvm > > > > > *gpuvm, > > > > > + struct drm_gem_object *obj) > > > > > +{ > > > > > + return obj && obj->resv != gpuvm->resv; > > > > > +} > > > > > + > > > > > static inline struct drm_gpuva * > > > > > __drm_gpuva_next(struct drm_gpuva *va) > > > > > { > > > > > @@ -346,6 +395,128 @@ __drm_gpuva_next(struct drm_gpuva *va) > > > > > #define drm_gpuvm_for_each_va_safe(va__, next__, gpuvm__) > > > > > \ > > > > > list_for_each_entry_safe(va__, next__, &(gpuvm__)- > > > > > >rb.list, > > > > > rb.entry) > > > > > +/** > > > > > + * struct drm_gpuvm_exec - &drm_gpuvm abstraction of > > > > > &drm_exec > > > > > + * > > > > > + * This structure should be created on the stack as > > > > > &drm_exec > > > > > should be. > > > > > + * > > > > > + * Optionally, @extra can be set in order to lock additional > > > > > &drm_gem_objects. > > > > > + */ > > > > > +struct drm_gpuvm_exec { > > > > > + /** > > > > > + * @exec: the &drm_exec structure > > > > > + */ > > > > > + struct drm_exec exec; > > > > > + > > > > > + /** > > > > > + * @vm: the &drm_gpuvm to lock its DMA reservations > > > > > + */ > > > > > + struct drm_gpuvm *vm; > > > > > + > > > > > + /** > > > > > + * @extra: Callback and corresponding private data for > > > > > the > > > > > driver to > > > > > + * lock arbitrary additional &drm_gem_objects. > > > > > + */ > > > > > + struct { > > > > > + /** > > > > > + * @fn: The driver callback to lock additional > > > > > &drm_gem_objects. > > > > > + */ > > > > > + int (*fn)(struct drm_gpuvm_exec *vm_exec, > > > > > + unsigned int num_fences); > > > > > + > > > > > + /** > > > > > + * @priv: driver private data for the @fn callback > > > > > + */ > > > > > + void *priv; > > > > > + } extra; > > > > > +}; > > > > > + > > > > > +/** > > > > > + * drm_gpuvm_prepare_vm() - prepare the GPUVMs common dma- > > > > > resv > > > > > + * @gpuvm: the &drm_gpuvm > > > > > + * @exec: the &drm_exec context > > > > > + * @num_fences: the amount of &dma_fences to reserve > > > > > + * > > > > > + * Calls drm_exec_prepare_obj() for the GPUVMs dummy > > > > > &drm_gem_object. > > > > > + * > > > > > + * Using this function directly, it is the drivers > > > > > responsibility to call > > > > > + * drm_exec_init() and drm_exec_fini() accordingly. > > > > > + * > > > > > + * Returns: 0 on success, negative error code on failure. > > > > > + */ > > > > > +static inline int > > > > > +drm_gpuvm_prepare_vm(struct drm_gpuvm *gpuvm, > > > > > + struct drm_exec *exec, > > > > > + unsigned int num_fences) > > > > > +{ > > > > > + return drm_exec_prepare_obj(exec, &gpuvm->d_obj, > > > > > num_fences); > > > > > +} > > > > > + > > > > > +int drm_gpuvm_prepare_objects(struct drm_gpuvm *gpuvm, > > > > > + struct drm_exec *exec, > > > > > + unsigned int num_fences); > > > > > + > > > > > +int drm_gpuvm_prepare_range(struct drm_gpuvm *gpuvm, > > > > > + struct drm_exec *exec, > > > > > + u64 addr, u64 range, > > > > > + unsigned int num_fences); > > > > > + > > > > > +int drm_gpuvm_exec_lock(struct drm_gpuvm_exec *vm_exec, > > > > > + unsigned int num_fences, > > > > > + bool interruptible); > > > > > + > > > > > +int drm_gpuvm_exec_lock_array(struct drm_gpuvm_exec > > > > > *vm_exec, > > > > > + struct drm_gem_object **objs, > > > > > + unsigned int num_objs, > > > > > + unsigned int num_fences, > > > > > + bool interruptible); > > > > > + > > > > > +int drm_gpuvm_exec_lock_range(struct drm_gpuvm_exec > > > > > *vm_exec, > > > > > + u64 addr, u64 range, > > > > > + unsigned int num_fences, > > > > > + bool interruptible); > > > > > + > > > > > +/** > > > > > + * drm_gpuvm_lock() - lock all dma-resv of all assoiciated > > > > > BOs > > > > > + * @gpuvm: the &drm_gpuvm > > > > > + * > > > > > + * Releases all dma-resv locks of all &drm_gem_objects > > > > > previously acquired > > > > > + * through drm_gpuvm_lock() or its variants. > > > > > + * > > > > > + * Returns: 0 on success, negative error code on failure. > > > > > + */ > > > > > +static inline void > > > > > +drm_gpuvm_exec_unlock(struct drm_gpuvm_exec *vm_exec) > > > > > +{ > > > > > + drm_exec_fini(&vm_exec->exec); > > > > > +} > > > > > + > > > > > +int drm_gpuvm_validate(struct drm_gpuvm *gpuvm); > > > > > +void drm_gpuvm_resv_add_fence(struct drm_gpuvm *gpuvm, > > > > > + struct drm_exec *exec, > > > > > + struct dma_fence *fence, > > > > > + enum dma_resv_usage private_usage, > > > > > + enum dma_resv_usage extobj_usage); > > > > > + > > > > > +/** > > > > > + * drm_gpuvm_exec_resv_add_fence() > > > > > + * @vm_exec: the &drm_gpuvm_exec abstraction > > > > > + * @fence: fence to add > > > > > + * @private_usage: private dma-resv usage > > > > > + * @extobj_usage: extobj dma-resv usage > > > > > + * > > > > > + * See drm_gpuvm_resv_add_fence(). > > > > > + */ > > > > > +static inline void > > > > > +drm_gpuvm_exec_resv_add_fence(struct drm_gpuvm_exec > > > > > *vm_exec, > > > > > + struct dma_fence *fence, > > > > > + enum dma_resv_usage private_usage, > > > > > + enum dma_resv_usage extobj_usage) > > > > > +{ > > > > > + drm_gpuvm_resv_add_fence(vm_exec->vm, &vm_exec->exec, > > > > > fence, > > > > > + private_usage, extobj_usage); > > > > > +} > > > > > + > > > > > /** > > > > > * struct drm_gpuvm_bo - structure representing a > > > > > &drm_gpuvm > > > > > and > > > > > * &drm_gem_object combination > > > > > @@ -398,6 +569,18 @@ struct drm_gpuvm_bo { > > > > > * gpuva list. > > > > > */ > > > > > struct list_head gem; > > > > > + > > > > > + /** > > > > > + * @evict: List entry to attach to the > > > > > &drm_gpuvms > > > > > + * extobj list. > > > > > + */ > > > > > + struct list_head extobj; > > > > > + > > > > > + /** > > > > > + * @evict: List entry to attach to the > > > > > &drm_gpuvms > > > > > evict > > > > > + * list. > > > > > + */ > > > > > + struct list_head evict; > > > > > } entry; > > > > > } list; > > > > > }; > > > > > @@ -432,6 +615,9 @@ struct drm_gpuvm_bo * > > > > > drm_gpuvm_bo_find(struct drm_gpuvm *gpuvm, > > > > > struct drm_gem_object *obj); > > > > > +void drm_gpuvm_bo_evict(struct drm_gem_object *obj, bool > > > > > evict); > > > > > +void drm_gpuvm_bo_extobj_add(struct drm_gpuvm_bo *vm_bo); > > > > > + > > > > > /** > > > > > * drm_gpuvm_bo_for_each_va() - iterator to walk over a > > > > > list of > > > > > &drm_gpuva > > > > > * @va__: &drm_gpuva structure to assign to in each > > > > > iteration > > > > > step > > > > > @@ -837,6 +1023,17 @@ struct drm_gpuvm_ops { > > > > > * used. > > > > > */ > > > > > int (*sm_step_unmap)(struct drm_gpuva_op *op, void > > > > > *priv); > > > > > + > > > > > + /** > > > > > + * @bo_validate: called from drm_gpuvm_validate() > > > > > + * > > > > > + * Drivers receive this callback for every evicted > > > > > &drm_gem_object being > > > > > + * mapped in the corresponding &drm_gpuvm. > > > > > + * > > > > > + * Typically, drivers would call their driver specific > > > > > variant of > > > > > + * ttm_bo_validate() from within this callback. > > > > > + */ > > > > > + int (*bo_validate)(struct drm_gem_object *obj); > > > > > > > > Same here. Could we have a vm_bo as an argument instead, so > > > > that > > > > the callback knows what gpuvm we're targeting and can mark all > > > > its > > > > gpu_vas for revalidation? Or is that intended to be done > > > > elsewhere? > > > > > > Makes sense as well. I'll change that too. > > > > I forgot, drm_gpuvm_validate() would preferably take an > > drm_gpuvm_exec > > argument because we need it in the validate callback. It's also > > easy > > for the driver to subclass further if needed, to pass even more > > arguments to its validate callback. > > Hm.. that implies that a driver open coding the drm_exec loop, still > needs > to use a struct drm_gpuvm_exec rather than just a struct drm_exec. > What is > this needed for in Xe? Do we expect other drivers needing it? Might a > priv > void pointer maybe make more sense? It's for sleeping locks during eviction rather than trylocks. TTM currently fishes out the struct ww_acquire_context used for locking from the lock itself, but I'd expect that to be more explicit in the near future with a variant of ttm_bo_validate() that actually explicitly takes a drm_exec as argument. So we would probably also like to try to find a way to encourage drivers to include the validate() in the until_all_locked() loop, because if TTM resorts to a sleeping lock *after* that loop, the following warning will be hit: https://elixir.bootlin.com/linux/latest/source/kernel/locking/ww_mutex.h#L195 So not sure what's best, but perhaps then a struct drm_exec * or a (void *) /Thomas > > > > > /Thomas > > > > > > > > > > > > > > > > }; > > > > > int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv, > > > > > > > > Thanks, > > > > > > > > Thomas > > > > > > > > > > > > > >