On Thu, Nov 10, 2016 at 11:51:27AM +0000, Tvrtko Ursulin wrote: > > On 10/11/2016 11:11, Chris Wilson wrote: > >On Thu, Nov 10, 2016 at 10:43:29AM +0000, Tvrtko Ursulin wrote: > >> > >>On 07/11/2016 13:59, Chris Wilson wrote: > >>>Defer the transfer from the client's timeline onto the execution > >>>timeline from the point of readiness to the point of actual submission. > >>>For example, in execlists, a request is finally submitted to hardware > >>>when the hardware is ready, and only put onto the hardware queue when > >>>the request is ready. By deferring the transfer, we ensure that the > >>>timeline is maintained in retirement order if we decide to queue the > >>>requests onto the hardware in a different order than fifo. > >>> > >>>v2: Rebased onto distinct global/user timeline lock classes. > >>> > >>>Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > >>>--- > >>>drivers/gpu/drm/i915/i915_gem_request.c | 31 +++++++++++++++++------------- > >>>drivers/gpu/drm/i915/i915_gem_request.h | 2 ++ > >>>drivers/gpu/drm/i915/i915_guc_submission.c | 14 +++++++++++++- > >>>drivers/gpu/drm/i915/intel_lrc.c | 23 +++++++++++++--------- > >>>drivers/gpu/drm/i915/intel_ringbuffer.c | 2 ++ > >>>5 files changed, 49 insertions(+), 23 deletions(-) > >>> > >>>diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c > >>>index e41d51a68ed8..19c29fafb07a 100644 > >>>--- a/drivers/gpu/drm/i915/i915_gem_request.c > >>>+++ b/drivers/gpu/drm/i915/i915_gem_request.c > >>>@@ -307,25 +307,16 @@ static u32 timeline_get_seqno(struct i915_gem_timeline *tl) > >>> return atomic_inc_return(&tl->next_seqno); > >>>} > >>> > >>>-static int __i915_sw_fence_call > >>>-submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) > >>>+void __i915_gem_request_submit(struct drm_i915_gem_request *request) > >>>{ > >>>- struct drm_i915_gem_request *request = > >>>- container_of(fence, typeof(*request), submit); > >>> struct intel_engine_cs *engine = request->engine; > >>> struct intel_timeline *timeline; > >>>- unsigned long flags; > >>> u32 seqno; > >>> > >>>- if (state != FENCE_COMPLETE) > >>>- return NOTIFY_DONE; > >>>- > >>> /* Transfer from per-context onto the global per-engine timeline */ > >>> timeline = engine->timeline; > >>> GEM_BUG_ON(timeline == request->timeline); > >>>- > >>>- /* Will be called from irq-context when using foreign DMA fences */ > >>>- spin_lock_irqsave(&timeline->lock, flags); > >>>+ assert_spin_locked(&timeline->lock); > >>> > >>> seqno = timeline_get_seqno(timeline->common); > >>> GEM_BUG_ON(!seqno); > >>>@@ -345,15 +336,29 @@ submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) > >>> GEM_BUG_ON(!request->global_seqno); > >>> engine->emit_breadcrumb(request, > >>> request->ring->vaddr + request->postfix); > >>>- engine->submit_request(request); > >>> > >>> spin_lock(&request->timeline->lock); > >>> list_move_tail(&request->link, &timeline->requests); > >>> spin_unlock(&request->timeline->lock); > >>> > >>> i915_sw_fence_commit(&request->execute); > >>>+} > >>> > >>>- spin_unlock_irqrestore(&timeline->lock, flags); > >>>+static int __i915_sw_fence_call > >>>+submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) > >>>+{ > >>>+ if (state == FENCE_COMPLETE) { > >>>+ struct drm_i915_gem_request *request = > >>>+ container_of(fence, typeof(*request), submit); > >>>+ struct intel_engine_cs *engine = request->engine; > >>>+ unsigned long flags; > >>>+ > >>>+ /* Will be called from irq-context when using foreign fences. */ > >>>+ spin_lock_irqsave_nested(&engine->timeline->lock, flags, > >>>+ SINGLE_DEPTH_NESTING); > >>>+ engine->submit_request(request); > >>>+ spin_unlock_irqrestore(&engine->timeline->lock, flags); > > > >>Would it be cleaner to move the lock taking to engine->submit_request? > > > >Perhaps. Certainly pushes the ugliness down a layer! > > > >>And is _nested still required? I thought you said it is not. I can't > >>find signalling under the timeline lock either. > > > >It is still required to sort out the ordering between external > >lockclasses (vgem/nouveau/etc) > > Hm, how? I don't see it. :( vgem was triggering a different path but it was the combination of swfences causing the lockdep splat. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx