On Mon, Apr 24, 2017 at 02:19:54PM +0100, Chris Wilson wrote:
> On Mon, Apr 24, 2017 at 02:03:25PM +0100, Tvrtko Ursulin wrote:
> >
> > On 19/04/2017 10:41, Chris Wilson wrote:
> > >Track the latest fence waited upon on each context, and only add a new
> > >asynchronous wait if the new fence is more recent than the recorded
> > >fence for that context. This requires us to filter out unordered
> > >timelines, which are noted by DMA_FENCE_NO_CONTEXT. However, in the
> > >absence of a universal identifier, we have to use our own
> > >i915->mm.unordered_timeline token.
> >
> > (._.), a bit later... @_@!
> >
> > What does this fix and is the complexity worth it?
>
> It's a recovery of the optimisation that we used to have from the
> initial multiple engine semaphore synchronisation - that of avoiding
> repeating the same synchronisation barriers.
>
> In the current setup, the cost of repeat fence synchronisation is
> obfuscated, it just causes a tight loop between
>
>  /<---------------------------------------------\
>  |                                              ^
> i915_sw_fence_complete -> i915_sw_fence_commit ->|
>
> and extra depth in the dependency trees, which is generally not
> observed in normal usage.
>
> When you know what you are looking for, the reduction of all those
> atomic ops from underneath hardirq is definitely worth it, even for
> fairly simple operations, and there tends to be repetition from all the
> buffers being tracked between requests (and clients).

And it also says, to me at least, that the cost of the lookup must be
less than the cost of a couple of atomics.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
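
For readers following the thread, the idea in the patch description (remember
the latest fence waited upon per context, and skip a new wait unless the fence
is more recent) can be sketched roughly as below. This is only an illustrative
userspace sketch and not the i915 code: the names sync_map, sync_map_init,
sync_map_needs_wait and UNORDERED_CONTEXT are hypothetical, the fixed-size
linear table is a stand-in for whatever lookup structure the patch really
uses, and the wrap-safe 32-bit seqno comparison is an assumption about how
"more recent" is decided.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical token for fences that belong to no ordered timeline. */
#define UNORDERED_CONTEXT 0u
/* Hypothetical capacity; a real implementation would not be fixed-size. */
#define MAX_CONTEXTS 16

struct sync_entry {
	uint64_t context;	/* timeline (fence context) identifier */
	uint32_t seqno;		/* latest seqno already waited upon */
	bool used;
};

struct sync_map {
	struct sync_entry e[MAX_CONTEXTS];
};

static void sync_map_init(struct sync_map *m)
{
	memset(m, 0, sizeof(*m));
}

/* Wrap-safe "a is strictly later than b" for 32-bit seqnos. */
static bool seqno_later(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) > 0;
}

/*
 * Returns true if a new wait has to be queued for (context, seqno), and
 * records the seqno as the latest one waited upon for that context.
 * Returns false if an equal or later fence on the same context has
 * already been waited upon, so the wait would be redundant.
 */
static bool sync_map_needs_wait(struct sync_map *m, uint64_t context,
				uint32_t seqno)
{
	struct sync_entry *free_slot = NULL;
	size_t i;

	/* Unordered fences cannot be compared, so always wait on them. */
	if (context == UNORDERED_CONTEXT)
		return true;

	for (i = 0; i < MAX_CONTEXTS; i++) {
		struct sync_entry *e = &m->e[i];

		if (!e->used) {
			if (!free_slot)
				free_slot = e;
			continue;
		}
		if (e->context != context)
			continue;

		if (!seqno_later(seqno, e->seqno))
			return false;	/* already covered by an earlier wait */

		e->seqno = seqno;
		return true;
	}

	/* First fence seen on this context: remember it if there is room. */
	if (free_slot) {
		free_slot->context = context;
		free_slot->seqno = seqno;
		free_slot->used = true;
	}

	/* Table full or new context: waiting is always the safe answer. */
	return true;
}
```

In this sketch a caller would check sync_map_needs_wait() before adding each
asynchronous wait and skip the wait when it returns false. Whether that pays
off depends on the point made above: the lookup has to be cheaper than the
couple of atomics it avoids.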