On Tue, Apr 14, 2015 at 02:51:37PM +0100, Tvrtko Ursulin wrote:
>
> On 04/07/2015 04:20 PM, Chris Wilson wrote:
> >Currently, we only track the last request globally across all engines.
> >This prevents us from issuing concurrent read requests on e.g. the RCS
> >and BCS engines (or more likely the render and media engines). Without
> >semaphores, we incur costly stalls as we synchronise between rings -
> >greatly impacting the current performance of Broadwell versus Haswell in
> >certain workloads (like video decode). With the introduction of
> >reference counted requests, it is much easier to track the last request
> >per ring, as well as the last global write request so that we can
> >optimise inter-engine read-read requests (as well as better optimise
> >certain CPU waits).
> >
> >v2: Fix inverted readonly condition for nonblocking waits.
> >v3: Handle non-contiguous engine array after waits
> >v4: Rebase, tidy, rewrite ring list debugging
> >v5: Use obj->active as a bitfield, it looks cool
> >v6: Micro-optimise, mostly involving moving code around
> >v7: Fix retire-requests-upto for execlists (and multiple rq->ringbuf)
> >v8: Rebase
>
> I am still slightly concerned with the sequential ring req waiting
> in combination with optimistic spinning, but other than that looks
> good to me:

I hear you. I don't yet have a scenario where I care, but with a little
more refactoring (see next version), extending i915_wait_request to work
on an array of requests will be a reasonably easy task.

> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>

Thanks, but I have a new version on its way with minor changes. Spotted
an issue with Ironlake and do_idling(), plus some slight refactoring.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre