Quoting Tvrtko Ursulin (2019-09-16 11:13:23)
>
> On 02/09/2019 05:02, Chris Wilson wrote:
> > Replace the struct_mutex requirement for pinning the i915_vma with the
> > local vm->mutex instead. Note that the vm->mutex is tainted by the
> > shrinker (we require unbinding from inside fs-reclaim) and so we cannot
> > allocate while holding that mutex. Instead we have to preallocate
> > workers to do the allocation and apply the PTE updates after we have
> > reserved their slot in the drm_mm (using fences to order the PTE writes
> > with the GPU work and with later unbind).
>
> Can you put some paragraphs into the commit describing the
> infrastructure changes? Like changes to active tracker at least.

You mean the addition of the preallocated node?

> Then commentary on effects on shrinker, fences, object management,
> execbuf, basically all major parts of the code which have now been
> fundamentally improved or changed. It would help with review. It's a
> chunky patch/change and I think it needs some theory of operation text.

Off the top of my head, they are all just lock switches. The funky stuff
lies around avoiding introducing a kref for i915_vma, and that requires
a whole bunch of trylocks to avoid double-frees. (I did plan on having a
kref here, but that leads to rewriting object page tracking, which is in
its own quagmire.)

> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c
> > index f99920652751..9e72b42a86f5 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c
> > @@ -152,6 +152,17 @@ static void clear_pages_dma_fence_cb(struct dma_fence *fence,
> >  	irq_work_queue(&w->irq_work);
> >  }
> >  
> > +static int move_to_active(struct i915_vma *vma, struct i915_request *rq)
> > +{
> > +	int err;
> > +
> > +	err = i915_request_await_active(rq, &vma->active);
> > +	if (err)
> > +		return err;
> > +
> > +	return i915_active_ref(&vma->active, rq->timeline, rq);
> > +}
> > +
> >  static void clear_pages_worker(struct work_struct *work)
> >  {
> >  	struct clear_pages_work *w = container_of(work, typeof(*w), work);
> > @@ -211,7 +222,7 @@ static void clear_pages_worker(struct work_struct *work)
> >  	 * keep track of the GPU activity within this vma/request, and
> >  	 * propagate the signal from the request to w->dma.
> >  	 */
> > -	err = i915_active_ref(&vma->active, rq->timeline, rq);
> > +	err = move_to_active(vma, rq);
>
> What is happening here? It wasn't sufficiently ordered before or
> something changes with this patch?

We are introducing an independent ordering constraint. The clear-pages
worker is itself intertwined with the obj->resv, but one part of the
operation must be serialised with the asynchronous binding, and we don't
want that async binding to affect the object globally. So the
clear-pages worker open-codes the awaits from i915_vma_move_to_active,
but does not want to partake in the fence export (as the operation is
already in the obj->resv).

> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index f1c0e5d958f3..653f7275306a 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -313,8 +313,6 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
> >  	GEM_BUG_ON(!i915_gem_context_is_closed(ctx));
> >  
> >  	release_hw_id(ctx);
> > -	if (ctx->vm)
> > -		i915_vm_put(ctx->vm);
>
> Contexts no longer hold a reference to vm? Only vmas?
Dropped earlier on close rather than final put...

> > @@ -379,9 +377,13 @@ void i915_gem_context_release(struct kref *ref)
> >  
> >  static void context_close(struct i915_gem_context *ctx)
> >  {
> > +	i915_gem_context_set_closed(ctx);
> > +
> > +	if (ctx->vm)
> > +		i915_vm_close(ctx->vm);
>
> But now closed context means closed vm, but what about shared vms? Is
> open/close_vm now counted?

Yes. vm->open_count. This is all to avoid a double-free of the vma from
object-close and vm-close. But closing earlier tidies up the vm release
and restores some earlier behaviour to avoid unbinding from a dead vm.

> > -	ret = i915_vma_bind(vma, cache_level, PIN_UPDATE);
> > +	/* Wait for an earlier async bind */
> > +	ret = i915_active_wait(&vma->active);
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = i915_vma_bind(vma, cache_level, PIN_UPDATE, NULL);
>
> Waiting should not be implied in the bind? I am assuming at least there
> will be many callers to bind and like this all of them will have to know
> to do i915_active_wait first?

Not really, imo; i915_vma_bind() is a special case as it is the
low-level operation. And we only need the sync here as we plan to
rewrite those ongoing operations; in the other (non-UPDATE) caller, we
are adding a separate binding rather than overwriting an existing one.

> > @@ -483,6 +487,7 @@ static void i915_gem_object_bump_inactive_ggtt(struct drm_i915_gem_object *obj)
> >  		if (!drm_mm_node_allocated(&vma->node))
> >  			continue;
> >  
> > +		GEM_BUG_ON(vma->vm != &i915->ggtt.vm);
> >  		list_move_tail(&vma->vm_link, &vma->vm->bound_list);
> >  	}
> >  	mutex_unlock(&i915->ggtt.vm.mutex);
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > index c049199a1df5..c3dd8ce7e2b7 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > @@ -550,8 +550,11 @@ eb_add_vma(struct i915_execbuffer *eb,
> >  		eb_unreserve_vma(vma, vma->exec_flags);
> >  
> >  		list_add_tail(&vma->exec_link, &eb->unbound);
> > -		if (drm_mm_node_allocated(&vma->node))
> > +		if (drm_mm_node_allocated(&vma->node)) {
> > +			mutex_lock(&vma->vm->mutex);
> >  			err = i915_vma_unbind(vma);
> > +			mutex_unlock(&vma->vm->mutex);
>
> Should there be __i915_vma_unbind which asserts the lock and
> i915_vma_unbind which takes it?

Could do, it seems to be an even split. I had a goal that
i915_vma_unbind() should evaporate; it should become a pipelined
operation for the most part, with just the shrinker always being an
exception as our interface there is external.

> > +void i915_active_set_exclusive(struct i915_active *ref, struct dma_fence *f)
> > +{
> > +	GEM_BUG_ON(i915_active_is_idle(ref));
> > +
> > +	dma_fence_get(f);
> > +
> > +	rcu_read_lock();
> > +	if (rcu_access_pointer(ref->excl)) {
> > +		struct dma_fence *old;
> > +
> > +		old = dma_fence_get_rcu_safe(&ref->excl);
> > +		if (old) {
> > +			if (dma_fence_remove_callback(old, &ref->excl_cb))
> > +				atomic_dec(&ref->count);
> > +			dma_fence_put(old);
> > +		}
> > +	}
> > +	rcu_read_unlock();
> > +
> > +	atomic_inc(&ref->count);
> > +	rcu_assign_pointer(ref->excl, f);
> > +
> > +	if (dma_fence_add_callback(f, &ref->excl_cb, excl_cb)) {
> > +		RCU_INIT_POINTER(ref->excl, NULL);
> > +		atomic_dec(&ref->count);
> > +		dma_fence_put(f);
>
> Is this some silent failure path?

No, it's just a weird indication by dma-fence that the fence was already
signaled.
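(As an aside, a minimal illustrative sketch rather than anything from
this patch, with made-up names my_tracker/my_track_fence:
dma_fence_add_callback() returns -ENOENT when the fence has already
signalled, in which case the callback is never installed, so the caller
simply unwinds its own accounting on the spot, which is what the branch
above is doing.)

    #include <linux/atomic.h>
    #include <linux/dma-fence.h>
    #include <linux/kernel.h>

    struct my_tracker {
            struct dma_fence_cb cb;
            atomic_t count;
    };

    static void my_retire_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
    {
            struct my_tracker *t = container_of(cb, struct my_tracker, cb);

            /* Normal path: the fence signalled later; drop our count here. */
            atomic_dec(&t->count);
    }

    static void my_track_fence(struct my_tracker *t, struct dma_fence *f)
    {
            atomic_inc(&t->count);

            /*
             * -ENOENT means f had already signalled before the callback
             * could be added; my_retire_cb() will then never run, so undo
             * the accounting immediately instead of treating it as an error.
             */
            if (dma_fence_add_callback(f, &t->cb, my_retire_cb))
                    atomic_dec(&t->count);
    }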
> >  int i915_request_await_active(struct i915_request *rq, struct i915_active *ref)
> >  {
> > -	struct active_node *it, *n;
> > -	int err;
> > -
> > -	if (RB_EMPTY_ROOT(&ref->tree))
> > -		return 0;
> > +	int err = 0;
> >  
> > -	/* await allocates and so we need to avoid hitting the shrinker */
> > -	err = i915_active_acquire(ref);
> > -	if (err)
> > -		return err;
> > +	if (rcu_access_pointer(ref->excl)) {
> > +		struct dma_fence *fence;
> >  
> > -	mutex_lock(&ref->mutex);
> > -	rbtree_postorder_for_each_entry_safe(it, n, &ref->tree, node) {
> > -		err = i915_request_await_active_request(rq, &it->base);
> > -		if (err)
> > -			break;
> > +		rcu_read_lock();
> > +		fence = dma_fence_get_rcu_safe(&ref->excl);
> > +		rcu_read_unlock();
> > +		if (fence) {
> > +			err = i915_request_await_dma_fence(rq, fence);
> > +			dma_fence_put(fence);
> > +		}
> >  	}
> > -	mutex_unlock(&ref->mutex);
> >  
> > -	i915_active_release(ref);
> > +	/* In the future we may choose to await on all fences */
> > +
>
> This is completely changed to just wait on the "exclusive" fence? What
> about the tree of nodes? It's still in the data structure and managed
> elsewhere. Not needed any more until "future" from the comment?

It was all dead code. The use case I have for it atm is purely about
managing that exclusive timeline.
-Chris