On Wed, 2022-01-05 at 15:52 +0000, Matthew Auld wrote: > On 04/01/2022 12:51, Thomas Hellström wrote: > > Implement async (non-blocking) unbinding by not syncing the vma > > before > > calling unbind on the vma_resource. > > Add the resulting unbind fence to the object's dma_resv from where > > it is > > picked up by the ttm migration code. > > Ideally these unbind fences should be coalesced with the migration > > blit > > fence to avoid stalling the migration blit waiting for unbind, as > > they > > can certainly go on in parallel, but since we don't yet have a > > reasonable data structure to use to coalesce fences and attach the > > resulting fence to a timeline, we defer that for now. > > > > Note that with async unbinding, even while the unbind waits for the > > preceding bind to complete before unbinding, the vma itself might > > have been > > destroyed in the process, clearing the vma pages. Therefore we can > > only allow async unbinding if we have a refcounted sg-list and keep > > a > > refcount on that for the vma resource pages to stay intact until > > binding occurs. If this condition is not met, a request for an > > async > > unbind is diverted to a sync unbind. > > > > v2: > > - Use a separate kmem_cache for vma resources for now to isolate > > their > > memory allocation and aid debugging. > > - Move the check for vm closed to the actual unbinding thread. > > Regardless > > of whether the vm is closed, we need the unbind fence to > > properly wait > > for capture. > > - Clear vma_res::vm on unbind and update its documentation. > > v4: > > - Take cache coloring into account when searching for vma resources > > pending unbind. (Matthew Auld) > > v5: > > - Fix timeout and error check in > > i915_vma_resource_bind_dep_await(). > > - Avoid taking a reference on the object for async binding if > > async unbind capable. > > - Fix braces around a single-line if statement. > > > > Signed-off-by: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx> > > <snip> > > > @@ -434,12 +439,30 @@ int i915_vma_bind(struct i915_vma *vma, > > > > bind_flags &= ~vma_flags; > > if (bind_flags == 0) { > > - kfree(vma_res); > > + i915_vma_resource_free(vma_res); > > return 0; > > } > > > > GEM_BUG_ON(!atomic_read(&vma->pages_count)); > > > > + /* Wait for or await async unbinds touching our range */ > > + if (work && bind_flags & vma->vm->bind_async_flags) > > + ret = i915_vma_resource_bind_dep_await(vma->vm, > > + &work- > > >base.chain, > > + vma- > > >node.start, > > + vma- > > >node.size, > > + true, > > + GFP_NOWAIT | > > + > > __GFP_RETRY_MAYFAIL | > > + > > __GFP_NOWARN); > > + else > > + ret = i915_vma_resource_bind_dep_sync(vma->vm, vma- > > >node.start, > > + vma- > > >node.size, true); > > + if (ret) { > > + i915_vma_resource_free(vma_res); > > + return ret; > > + } > > + > > if (vma->resource || !vma_res) { > > /* Rebinding with an additional I915_VMA_*_BIND */ > > GEM_WARN_ON(!vma_flags); > > @@ -452,9 +475,11 @@ int i915_vma_bind(struct i915_vma *vma, > > if (work && bind_flags & vma->vm->bind_async_flags) { > > struct dma_fence *prev; > > > > - work->vma = vma; > > + work->vma_res = i915_vma_resource_get(vma- > > >resource); > > work->cache_level = cache_level; > > work->flags = bind_flags; > > + if (vma->obj->mm.rsgt) > > + work->rsgt = i915_refct_sgt_get(vma->obj- > > >mm.rsgt); > > Hmmm, at a glance I would have expected this to use the vma->pages. I > think with the GGTT the vma will often create its own sg layout which > != > obj->mm.sgt. IIUC the async unbind will still call vma_unbind_pages > which might nuke the vma sgt? Or is something else going on here? > Yes, the binding code is only using vma_res->pages, which should have been copied from vma->pages, and keeps a reference to the rsgt just in case we do an async unbind. However good point we should refuse async unbind for now if vma_res- >pages != &rsgt->table, because the former might otherwise be nuked before the async unbind actually happens. Will fix that for next version. /Thomas