Quoting Chris Wilson (2020-07-29 14:42:06)
> Quoting Tvrtko Ursulin (2020-07-29 13:40:38)
> >
> > On 28/07/2020 15:28, Chris Wilson wrote:
> > > Quoting Tvrtko Ursulin (2020-07-17 14:04:58)
> > >>
> > >> On 15/07/2020 12:50, Chris Wilson wrote:
> > >>> Rather than require the next timeline after idling to match the MRU
> > >>> before idling, reset the index on the node and allow it to match the
> > >>> first request. However, this requires cmpxchg(u64) and so is not trivial
> > >>> on 32b, so for compatibility we just fall back to keeping the cached
> > >>> node pointing to the MRU timeline.
> > >>>
> > >>> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > >>> ---
> > >>>  drivers/gpu/drm/i915/i915_active.c | 21 +++++++++++++++++++--
> > >>>  1 file changed, 19 insertions(+), 2 deletions(-)
> > >>>
> > >>> diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
> > >>> index 0854b1552bc1..6737b5615c0c 100644
> > >>> --- a/drivers/gpu/drm/i915/i915_active.c
> > >>> +++ b/drivers/gpu/drm/i915/i915_active.c
> > >>> @@ -157,6 +157,10 @@ __active_retire(struct i915_active *ref)
> > >>>  		rb_link_node(&ref->cache->node, NULL, &ref->tree.rb_node);
> > >>>  		rb_insert_color(&ref->cache->node, &ref->tree);
> > >>>  		GEM_BUG_ON(ref->tree.rb_node != &ref->cache->node);
> > >>> +
> > >>> +		/* Make the cached node available for reuse with any timeline */
> > >>> +		if (IS_ENABLED(CONFIG_64BIT))
> > >>> +			ref->cache->timeline = 0; /* needs cmpxchg(u64) */
> > >>
> > >> Or when the fence context wraps, shock horror.
> > >
> > > I'm more concerned that we use timeline:0 as a special unordered
> > > timeline. It's reserved by its use in the dma_fence_stub, and
> > > everything will start to break when the timelines wrap. The earliest
> > > casualties will be the kernel_context timelines, which are also very
> > > special indices for the barriers.
> > >
> > >>
> > >>>  	}
> > >>>
> > >>>  	spin_unlock_irqrestore(&ref->tree_lock, flags);
> > >>> @@ -235,9 +239,22 @@ static struct active_node *__active_lookup(struct i915_active *ref, u64 idx)
> > >>>  {
> > >>>  	struct active_node *it;
> > >>>
> > >>> +	GEM_BUG_ON(idx == 0); /* 0 is the unordered timeline, rsvd for cache */
> > >>> +
> > >>>  	it = READ_ONCE(ref->cache);
> > >>> -	if (it && it->timeline == idx)
> > >>> -		return it;
> > >>> +	if (it) {
> > >>> +		u64 cached = READ_ONCE(it->timeline);
> > >>> +
> > >>> +		if (cached == idx)
> > >>> +			return it;
> > >>> +
> > >>> +#ifdef CONFIG_64BIT /* for cmpxchg(u64) */
> > >>> +		if (!cached && !cmpxchg(&it->timeline, 0, idx)) {
> > >>> +			GEM_BUG_ON(i915_active_fence_isset(&it->base));
> > >>> +			return it;
> > >>
> > >> cmpxchg suggests this needs to be atomic, however above the check for
> > >> equality comes from a separate read.
> > >
> > > That's fine, and quite common, to avoid the cmpxchg if the current
> > > value already does not match the expected condition.
> >
> > How? What if another thread is about to install its idx into
> > it->timeline with cmpxchg and this thread does not see it because it
> > just returned on the "cached == idx" condition?
>
> Because it's non-zero.
>
> If the idx is already non-zero, it will always remain non-zero until
> everybody idles (and there are no more threads).
>
> If the idx is zero, it can only transition to non-zero once, atomically
> via cmpxchg. The first and only cmpxchg will return that the previous
> value was 0, and so return with it->timeline == idx.
As for the case where two threads are attempting to install 2 different
fences into the same timeline slot -- that concurrency is controlled by
the timeline mutex, or some other agreed-upon serialisation for the slot
[e.g. the exclusive slot doesn't have an intel_timeline associated with
it, and some ranges use mutexes other than the intel_timeline's].
-Chris
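
For reference, the claim pattern described in the quoted explanation can
be reduced to a standalone sketch. This uses C11 atomics in place of the
kernel's READ_ONCE()/cmpxchg(), and the names (cache_node, cache_claim)
are illustrative rather than taken from i915:

    #include <assert.h>
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdint.h>

    struct cache_node {
            /* 0 is reserved to mean "idle, reusable by any timeline" */
            _Atomic uint64_t timeline;
    };

    /*
     * Returns true if node may be used for idx: either it already
     * belongs to idx, or it was idle and we claimed it atomically.
     */
    static bool cache_claim(struct cache_node *node, uint64_t idx)
    {
            uint64_t cached = atomic_load(&node->timeline);

            /*
             * Fast path: a non-zero timeline is stable until everybody
             * idles, so this unsynchronised read cannot be invalidated
             * by a concurrent claim.
             */
            if (cached == idx)
                    return true;

            /*
             * Slow path: 0 -> non-zero happens at most once per idle
             * cycle, and exactly one thread's compare-exchange observes
             * the 0 and wins.
             */
            if (cached == 0 &&
                atomic_compare_exchange_strong(&node->timeline, &cached, idx))
                    return true;

            return false;
    }

    int main(void)
    {
            struct cache_node node = { .timeline = 0 };

            assert(cache_claim(&node, 5));  /* claims the idle node */
            assert(cache_claim(&node, 5));  /* fast path, same timeline */
            assert(!cache_claim(&node, 7)); /* would fall back to the tree */
            return 0;
    }

The key property is that the timeline makes a single 0 -> non-zero
transition per idle cycle, so the racy pre-read can never miss a
competing claim: it either sees the final non-zero value, or it sees 0
and the cmpxchg arbitrates between the claimants.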