Quoting Tvrtko Ursulin (2018-07-04 12:34:04) > > On 04/07/2018 10:39, Tvrtko Ursulin wrote: > > > > On 04/07/2018 09:34, Chris Wilson wrote: > >> Using a VMA on more than one timeline concurrently is the exception > >> rather than the rule (using it concurrently on multiple engines). As we > >> expect to only use one active tracker, store the most recently used > >> tracker inside the i915_vma itself and only fallback to the rbtree if > >> we need a second or more concurrent active trackers. > >> > >> v2: Comments on how we overwrite any existing last_active cache. > >> > >> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > >> Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > >> --- > >> drivers/gpu/drm/i915/i915_vma.c | 50 +++++++++++++++++++++++++++++++-- > >> drivers/gpu/drm/i915/i915_vma.h | 1 + > >> 2 files changed, 49 insertions(+), 2 deletions(-) > >> > >> diff --git a/drivers/gpu/drm/i915/i915_vma.c > >> b/drivers/gpu/drm/i915/i915_vma.c > >> index cd94ffc7f079..33925e00f7e8 100644 > >> --- a/drivers/gpu/drm/i915/i915_vma.c > >> +++ b/drivers/gpu/drm/i915/i915_vma.c > >> @@ -119,6 +119,12 @@ i915_vma_retire(struct i915_gem_active *base, > >> struct i915_request *rq) > >> __i915_vma_retire(active->vma, rq); > >> } > >> +static void > >> +i915_vma_last_retire(struct i915_gem_active *base, struct > >> i915_request *rq) > >> +{ > >> + __i915_vma_retire(container_of(base, struct i915_vma, > >> last_active), rq); > >> +} > >> + > >> static struct i915_vma * > >> vma_create(struct drm_i915_gem_object *obj, > >> struct i915_address_space *vm, > >> @@ -136,6 +142,7 @@ vma_create(struct drm_i915_gem_object *obj, > >> vma->active = RB_ROOT; > >> + init_request_active(&vma->last_active, i915_vma_last_retire); > >> init_request_active(&vma->last_fence, NULL); > >> vma->vm = vm; > >> vma->ops = &vm->vma_ops; > >> @@ -895,6 +902,22 @@ static struct i915_gem_active > >> *lookup_active(struct i915_vma *vma, u64 idx) > >> { > >> struct i915_vma_active *active; > >> struct rb_node **p, *parent; > >> + struct i915_request *old; > >> + > >> + /* > >> + * We track the most recently used timeline to skip a rbtree search > >> + * for the common case, under typical loads we never need the rbtree > >> + * at all. We can reuse the last_active slot if it is empty, that is > >> + * after the previous activity has been retired, or if the active > >> + * matches the current timeline. > >> + */ > >> + old = i915_gem_active_raw(&vma->last_active, > >> + &vma->vm->i915->drm.struct_mutex); > >> + if (!old || old->fence.context == idx) > >> + goto out; > >> + > >> + /* Move the currently active fence into the rbtree */ > >> + idx = old->fence.context; > >> parent = NULL; > >> p = &vma->active.rb_node; > >> @@ -903,7 +926,7 @@ static struct i915_gem_active > >> *lookup_active(struct i915_vma *vma, u64 idx) > >> active = rb_entry(parent, struct i915_vma_active, node); > >> if (active->timeline == idx) > >> - return &active->base; > >> + goto replace; > >> if (active->timeline < idx) > >> p = &parent->rb_right; > >> @@ -922,7 +945,25 @@ static struct i915_gem_active > >> *lookup_active(struct i915_vma *vma, u64 idx) > >> rb_link_node(&active->node, parent, p); > >> rb_insert_color(&active->node, &vma->active); > >> - return &active->base; > >> +replace: > >> + /* > >> + * Overwrite the previous active slot in the rbtree with > >> last_active, > >> + * leaving last_active zeroed. If the previous slot is still active, > >> + * we must be careful as we now only expect to recieve one retire > > > > typo in receive > > > >> + * callback not two, and so much undo the active counting for the > >> + * overwritten slot. > >> + */ > >> + if (i915_gem_active_isset(&active->base)) { > >> + __list_del_entry(&active->base.link); > >> + vma->active_count--; > > > + GEM_BUG_ON(!vma->active_count); > > > > I still don't get this. The cache is exclusive, so when transferring a > > record from rbtree to last_active, why do we need to decrement the > > vma->active_count here? Don't get the part in the comment about two > > retires - do you really sometimes expect two - ie cache is not exclusive? > > > > But the fact that lookup of a cached entry is a straight return, meaning > > vma->active_count is manipulated elsewhere, makes me think it is > > avoidable messing with it on this path as well. > > > > Maybe the separation of duties between the callers and this function > > needs to be stronger. > > Hmm or is your cache actually inclusive? Don't see no rbtree > manipulation on migration to and from last_active/rbtree.. Both. Inclusive in the sense that both last_active and its timeline slot in the rbtree may be active tracking different requests and so receive retirement callbacks independently. Exclusive in that we don't store last_active in the cache slot and in the rbtree. > And since rbtree lookup is always for the last_active context id, you > would otherwise never hit the the "goto replace" path. > > How do you ever look up an id which is not cached in last_active then? We don't. We only lookup on evicting a still active request from last_active. The MRU recent request always goes into last_active. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx