Hi,

On Mon, 2022-01-17 at 21:07 +0000, Robert Beckett wrote:
> 
> 
> On 10/01/2022 17:22, Thomas Hellström wrote:
> > Implement async (non-blocking) unbinding by not syncing the vma before
> > calling unbind on the vma_resource.
> > Add the resulting unbind fence to the object's dma_resv from where it is
> > picked up by the ttm migration code.
> > Ideally these unbind fences should be coalesced with the migration blit
> > fence to avoid stalling the migration blit waiting for unbind, as they
> > can certainly go on in parallel, but since we don't yet have a
> > reasonable data structure to use to coalesce fences and attach the
> > resulting fence to a timeline, we defer that for now.
> > 
> > Note that with async unbinding, even while the unbind waits for the
> > preceding bind to complete before unbinding, the vma itself might have been
> > destroyed in the process, clearing the vma pages. Therefore we can
> > only allow async unbinding if we have a refcounted sg-list and keep a
> > refcount on that for the vma resource pages to stay intact until
> > binding occurs. If this condition is not met, a request for an async
> > unbind is diverted to a sync unbind.
> > 
> > v2:
> > - Use a separate kmem_cache for vma resources for now to isolate their
> >   memory allocation and aid debugging.
> > - Move the check for vm closed to the actual unbinding thread. Regardless
> >   of whether the vm is closed, we need the unbind fence to properly wait
> >   for capture.
> > - Clear vma_res::vm on unbind and update its documentation.
> > v4:
> > - Take cache coloring into account when searching for vma resources
> >   pending unbind. (Matthew Auld)
> > v5:
> > - Fix timeout and error check in i915_vma_resource_bind_dep_await().
> > - Avoid taking a reference on the object for async binding if
> >   async unbind capable.
> > - Fix braces around a single-line if statement.
> > v6:
> > - Fix up the cache coloring adjustment. (Kernel test robot <lkp@xxxxxxxxx>)
> > - Don't allow async unbinding if the vma_res pages are not the same as
> >   the object pages.
> >   (Matthew Auld)
> > v7:
> > - s/unsigned long/u64/ in a number of places (Matthew Auld)
> > 
> > Signed-off-by: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>
> > Reviewed-by: Matthew Auld <matthew.auld@xxxxxxxxx>
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c |  11 +-
> >  drivers/gpu/drm/i915/gt/intel_ggtt.c         |   2 +-
> >  drivers/gpu/drm/i915/gt/intel_gtt.c          |   4 +
> >  drivers/gpu/drm/i915/gt/intel_gtt.h          |   3 +
> >  drivers/gpu/drm/i915/i915_drv.h              |   1 +
> >  drivers/gpu/drm/i915/i915_gem.c              |  12 +-
> >  drivers/gpu/drm/i915/i915_module.c           |   3 +
> >  drivers/gpu/drm/i915/i915_vma.c              | 205 +++++++++--
> >  drivers/gpu/drm/i915/i915_vma.h              |   3 +-
> >  drivers/gpu/drm/i915/i915_vma_resource.c     | 354 +++++++++++++++++--
> >  drivers/gpu/drm/i915/i915_vma_resource.h     |  48 +++
> >  11 files changed, 579 insertions(+), 67 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
> > index 8653855d808b..1de306c03aaf 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c
> > @@ -142,7 +142,16 @@ int i915_ttm_move_notify(struct ttm_buffer_object *bo)
> >  	struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
> >  	int ret;
> >  
> > -	ret = i915_gem_object_unbind(obj, I915_GEM_OBJECT_UNBIND_ACTIVE);
> > +	/*
> > +	 * Note: The async unbinding here will actually transform the
> > +	 * blocking wait for unbind into a wait before finally submitting
> > +	 * evict / migration blit and thus stall the migration timeline
> > +	 * which may not be good for overall throughput. We should make
> > +	 * sure we await the unbind fences *after* the migration blit
> > +	 * instead of *before* as we currently do.
> > +	 */
> > +	ret = i915_gem_object_unbind(obj, I915_GEM_OBJECT_UNBIND_ACTIVE |
> > +				     I915_GEM_OBJECT_UNBIND_ASYNC);
> >  	if (ret)
> >  		return ret;
> >  
> > diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > index e49b6250c4b7..a1b2761bc16e 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > @@ -142,7 +142,7 @@ void i915_ggtt_suspend_vm(struct i915_address_space *vm)
> >  			continue;
> >  
> >  		if (!i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND)) {
> > -			__i915_vma_evict(vma);
> > +			__i915_vma_evict(vma, false);
> >  			drm_mm_remove_node(&vma->node);
> >  		}
> >  	}
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
> > index a94be0306464..46be4197b93f 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> > @@ -161,6 +161,9 @@ static void __i915_vm_release(struct work_struct *work)
> >  	struct i915_address_space *vm =
> >  		container_of(work, struct i915_address_space, release_work);
> >  
> > +	/* Synchronize async unbinds. */
> > +	i915_vma_resource_bind_dep_sync_all(vm);
> > +
> >  	vm->cleanup(vm);
> >  	i915_address_space_fini(vm);
> >  
> > @@ -189,6 +192,7 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass)
> >  	if (!kref_read(&vm->resv_ref))
> >  		kref_init(&vm->resv_ref);
> >  
> > +	vm->pending_unbind = RB_ROOT_CACHED;
> >  	INIT_WORK(&vm->release_work, __i915_vm_release);
> >  	atomic_set(&vm->open, 1);
> >  
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h
> > index 676b839d1a34..8073438b67c8 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gtt.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
> > @@ -265,6 +265,9 @@ struct i915_address_space {
> >  	/* Flags used when creating page-table objects for this vm */
> >  	unsigned long lmem_pt_obj_flags;
> >  
> > +	/* Interval tree for pending unbind vma resources */
> > +	struct rb_root_cached pending_unbind;
> > +
> >  	struct drm_i915_gem_object *
> >  		(*alloc_pt_dma)(struct i915_address_space *vm, int sz);
> >  	struct drm_i915_gem_object *
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index a44e0c3298fc..6caec2eca8cd 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1661,6 +1661,7 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> >  #define I915_GEM_OBJECT_UNBIND_BARRIER BIT(1)
> >  #define I915_GEM_OBJECT_UNBIND_TEST BIT(2)
> >  #define I915_GEM_OBJECT_UNBIND_VM_TRYLOCK BIT(3)
> > +#define I915_GEM_OBJECT_UNBIND_ASYNC BIT(4)
> >  
> >  void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv);
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 3729ec01b5bc..7188db42c85b 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -157,10 +157,16 @@ int i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> >  		spin_unlock(&obj->vma.lock);
> >  
> >  		if (vma) {
> > +			bool vm_trylock = !!(flags & I915_GEM_OBJECT_UNBIND_VM_TRYLOCK);
> >  			ret = -EBUSY;
> > -			if (flags & I915_GEM_OBJECT_UNBIND_ACTIVE ||
> > -			    !i915_vma_is_active(vma)) {
> > -				if (flags & I915_GEM_OBJECT_UNBIND_VM_TRYLOCK) {
> > +			if (flags & I915_GEM_OBJECT_UNBIND_ASYNC) {
> > +				assert_object_held(vma->obj);
> > +				ret = i915_vma_unbind_async(vma, vm_trylock);
> > +			}
> > +
> > +			if (ret == -EBUSY && (flags & I915_GEM_OBJECT_UNBIND_ACTIVE ||
> > +					      !i915_vma_is_active(vma))) {
> > +				if (vm_trylock) {
> >  					if (mutex_trylock(&vma->vm->mutex)) {
> >  						ret = __i915_vma_unbind(vma);
> >  						mutex_unlock(&vma->vm->mutex);
> > diff --git a/drivers/gpu/drm/i915/i915_module.c b/drivers/gpu/drm/i915/i915_module.c
> > index f6bcd2f89257..a8f175960b34 100644
> > --- a/drivers/gpu/drm/i915/i915_module.c
> > +++ b/drivers/gpu/drm/i915/i915_module.c
> > @@ -17,6 +17,7 @@
> >  #include "i915_scheduler.h"
> >  #include "i915_selftest.h"
> >  #include "i915_vma.h"
> > +#include "i915_vma_resource.h"
> >  
> >  static int i915_check_nomodeset(void)
> >  {
> > @@ -64,6 +65,8 @@ static const struct {
> >  	  .exit = i915_scheduler_module_exit },
> >  	{ .init = i915_vma_module_init,
> >  	  .exit = i915_vma_module_exit },
> > +	{ .init = i915_vma_resource_module_init,
> > +	  .exit = i915_vma_resource_module_exit },
> >  	{ .init = i915_mock_selftests },
> >  	{ .init = i915_pmu_init,
> >  	  .exit = i915_pmu_exit },
> > diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> > index 29c770a764aa..5881b0713b1a 100644
> > --- a/drivers/gpu/drm/i915/i915_vma.c
> > +++ b/drivers/gpu/drm/i915/i915_vma.c
> > @@ -286,9 +286,10 @@ struct i915_vma_work {
> >  	struct dma_fence_work base;
> >  	struct i915_address_space *vm;
> >  	struct i915_vm_pt_stash stash;
> > -	struct i915_vma *vma;
> > +	struct i915_vma_resource *vma_res;
> >  	struct drm_i915_gem_object *pinned;
> >  	struct i915_sw_dma_fence_cb cb;
> > +	struct i915_refct_sgt *rsgt;
> >  	enum i915_cache_level cache_level;
> >  	unsigned int flags;
> >  };
> > @@ -296,10 +297,11 @@ static void __vma_bind(struct dma_fence_work *work)
> >  {
> >  	struct i915_vma_work *vw = container_of(work, typeof(*vw), base);
> > -	struct i915_vma *vma = vw->vma;
> > +	struct i915_vma_resource *vma_res = vw->vma_res;
> > +
> > +	vma_res->ops->bind_vma(vma_res->vm, &vw->stash,
> > +			       vma_res, vw->cache_level, vw->flags);
> >  
> > -	vma->ops->bind_vma(vw->vm, &vw->stash,
> > -			   vma->resource, vw->cache_level, vw->flags);
> >  }
> >  
> >  static void __vma_release(struct dma_fence_work *work)
> > @@ -311,6 +313,10 @@ static void __vma_release(struct dma_fence_work *work)
> >  
> >  	i915_vm_free_pt_stash(vw->vm, &vw->stash);
> >  	i915_vm_put(vw->vm);
> > +	if (vw->vma_res)
> > +		i915_vma_resource_put(vw->vma_res);
> > +	if (vw->rsgt)
> > +		i915_refct_sgt_put(vw->rsgt);
> >  }
> >  
> >  static const struct dma_fence_work_ops bind_ops = {
> > @@ -380,13 +386,11 @@ i915_vma_resource_init_from_vma(struct i915_vma_resource *vma_res,
> >  {
> >  	struct drm_i915_gem_object *obj = vma->obj;
> >  
> > -	i915_vma_resource_init(vma_res, vma->pages, &vma->page_sizes,
> > +	i915_vma_resource_init(vma_res, vma->vm, vma->pages, &vma->page_sizes,
> >  			       i915_gem_object_is_readonly(obj),
> >  			       i915_gem_object_is_lmem(obj),
> > -			       vma->private,
> > -			       vma->node.start,
> > -			       vma->node.size,
> > -			       vma->size);
> > +			       vma->ops, vma->private, vma->node.start,
> > +			       vma->node.size, vma->size);
> >  }
> >  
> >  /**
> > @@ -410,6 +414,7 @@ int i915_vma_bind(struct i915_vma *vma,
> >  {
> >  	u32 bind_flags;
> >  	u32 vma_flags;
> > +	int ret;
> >  
> >  	lockdep_assert_held(&vma->vm->mutex);
> >  	GEM_BUG_ON(!drm_mm_node_allocated(&vma->node));
> > @@ -418,12 +423,12 @@ int i915_vma_bind(struct i915_vma *vma,
> >  	if (GEM_DEBUG_WARN_ON(range_overflows(vma->node.start,
> >  					      vma->node.size,
> >  					      vma->vm->total))) {
> > -		kfree(vma_res);
> > +		i915_vma_resource_free(vma_res);
> >  		return -ENODEV;
> >  	}
> >  
> >  	if (GEM_DEBUG_WARN_ON(!flags)) {
> > -		kfree(vma_res);
> > +		i915_vma_resource_free(vma_res);
> >  		return -EINVAL;
> >  	}
> >  
> > @@ -435,12 +440,30 @@ int i915_vma_bind(struct i915_vma *vma,
> >  
> >  	bind_flags &= ~vma_flags;
> >  	if (bind_flags == 0) {
> > -		kfree(vma_res);
> > +		i915_vma_resource_free(vma_res);
> >  		return 0;
> >  	}
> >  
> >  	GEM_BUG_ON(!atomic_read(&vma->pages_count));
> >  
> > +	/* Wait for or await async unbinds touching our range */
> > +	if (work && bind_flags & vma->vm->bind_async_flags)
> > +		ret = i915_vma_resource_bind_dep_await(vma->vm,
> > +						       &work->base.chain,
> > +						       vma->node.start,
> > +						       vma->node.size,
> > +						       true,
> > +						       GFP_NOWAIT |
> > +						       __GFP_RETRY_MAYFAIL |
> > +						       __GFP_NOWARN);
> > +	else
> > +		ret = i915_vma_resource_bind_dep_sync(vma->vm, vma->node.start,
> > +						      vma->node.size, true);
> > +	if (ret) {
> > +		i915_vma_resource_free(vma_res);
> > +		return ret;
> > +	}
> > +
> >  	if (vma->resource || !vma_res) {
> >  		/* Rebinding with an additional I915_VMA_*_BIND */
> >  		GEM_WARN_ON(!vma_flags);
> 
> You missed an extra `kfree(vma_res)` on
> the next line from here (not in patch context), which would presumably
> need to be `i915_vma_resource_free(vma_res)`

It should, thanks. I see I've missed that in an error path as well, as
you mention below.

> 
> Further to that, you are freeing it where it can already be known to be
> null, which, while relying on the known zero side effect of a double
> free will work, looks very unintuitive.

This stems from the fact that NULL-checking before kfree() is typically
not done; IIRC even checkpatch.pl warns if you do. But with the
transition to kmem_cache_free() that no longer holds, so I need to fix
this up.

> 
> Further to that, once this series is applied, `i915_vma_bind` only
> appears to be called from `eb_relocate_entry()` with `vma_res` always
> null, and from `i915_vma_pin_ww()` where it allocates `vma_res`.
> 
> It would be much cleaner to just allocate the `vma_res` in
> `i915_vma_bind` if required. That way it will avoid the null assignment
> on return to `i915_vma_pin_ww` with the equally unintuitive
> `kfree(vma_res)` during the error fallthrough path at the end of
> `i915_vma_pin_ww` on a knowingly null pointer (also that would need to
> be a `i915_vma_resource_free` if you keep it the way it is).

We can't allocate (GFP_KERNEL) under the vm mutex, hence the need to
allocate the vma_res up front and pass it in as an argument.

Thanks,
Thomas
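PS. For anyone following along, the calling pattern under discussion looks
roughly like the sketch below. It is simplified and illustrative only, not
the exact code in the series: pin_example() is made up, and
i915_vma_resource_alloc() stands in for whatever the allocator ends up
being called.

/* Illustrative sketch only; helper names are placeholders. */
static int pin_example(struct i915_vma *vma, u32 flags,
		       struct i915_vma_work *work)
{
	struct i915_vma_resource *vma_res;
	int err;

	/* May sleep (GFP_KERNEL), so it has to happen before vm->mutex. */
	vma_res = i915_vma_resource_alloc();
	if (IS_ERR(vma_res))
		return PTR_ERR(vma_res);

	err = mutex_lock_interruptible(&vma->vm->mutex);
	if (err) {
		i915_vma_resource_free(vma_res);
		return err;
	}

	/*
	 * Runs with vm->mutex held; i915_vma_bind() takes ownership of
	 * vma_res and frees it itself on its early error paths.
	 */
	err = i915_vma_bind(vma, vma->obj->cache_level, flags, work, vma_res);

	mutex_unlock(&vma->vm->mutex);
	return err;
}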