Re: [Intel-gfx] [PATCH] drm/i915: Keep all engine locks across scheduling

Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> · Mon, 27 Mar 2017 22:06:45 +0100

On Mon, Mar 27, 2017 at 11:11:47AM +0100, Tvrtko Ursulin wrote:
> 
> On 26/03/2017 09:46, Chris Wilson wrote:
> >Unlocking is dangerous. In this case we combine an early update to the
> >out-of-queue request, because we know that it will be inserted into the
> >correct FIFO priority-ordered slot when it becomes ready in the future.
> >However, given sufficient enthusiasm, it may become ready as we are
> >continuing to reschedule, and so may gazump the FIFO if we have since
> >dropped its spinlock. The result is that it may be executed too early,
> >before its dependees.
> >
> >Fixes: 20311bd35060 ("drm/i915/scheduler: Execute requests in order of priorities")
> >Testcase: igt/gem_exec_whisper
> >Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> >Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> >Cc: <stable@xxxxxxxxxxxxxxx> # v4.10+
> >---
> > drivers/gpu/drm/i915/intel_lrc.c | 54 +++++++++++++++++++++++++++-------------
> > 1 file changed, 37 insertions(+), 17 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >index dd0e9d587852..3fdabba0a32d 100644
> >--- a/drivers/gpu/drm/i915/intel_lrc.c
> >+++ b/drivers/gpu/drm/i915/intel_lrc.c
> >@@ -658,30 +658,47 @@ static void execlists_submit_request(struct drm_i915_gem_request *request)
> > 	spin_unlock_irqrestore(&engine->timeline->lock, flags);
> > }
> >
> >-static struct intel_engine_cs *
> >-pt_lock_engine(struct i915_priotree *pt, struct intel_engine_cs *locked)
> >+static inline struct intel_engine_cs *
> >+pt_lock_engine(struct i915_priotree *pt, unsigned long *locked)
> > {
> >-	struct intel_engine_cs *engine;
> >-
> >-	engine = container_of(pt,
> >-			      struct drm_i915_gem_request,
> >-			      priotree)->engine;
> >-	if (engine != locked) {
> >-		if (locked)
> >-			spin_unlock_irq(&locked->timeline->lock);
> >-		spin_lock_irq(&engine->timeline->lock);
> >-	}
> >+	struct intel_engine_cs *engine =
> >+		container_of(pt, struct drm_i915_gem_request, priotree)->engine;
> >+
> >+	/* Locking the engines in a random order will rightfully trigger a
> >+	 * spasm in lockdep. However, we can ignore lockdep (by marking each
> >+	 * as a seperate nesting) so long as we never nest the
> >+	 * engine->timeline->lock elsewhere. Also the number of nesting
> >+	 * subclasses is severely limited (7) which is going to cause an
> >+	 * issue at some point.
> >+	 * BUILD_BUG_ON(I915_NUM_ENGINES >= MAX_LOCKDEP_SUBCLASSES);
> 
> Lets bite the bullet and not hide this BUILD_BUG_ON in a comment. :I

Another option appears to be to disable lockdep for the global engine locks:

diff --git a/drivers/gpu/drm/i915/i915_gem_timeline.c b/drivers/gpu/drm/i915/i91
index b596ca7..f1b7dbe 100644
--- a/drivers/gpu/drm/i915/i915_gem_timeline.c
+++ b/drivers/gpu/drm/i915/i915_gem_timeline.c
@@ -73,12 +73,11 @@ int i915_gem_timeline_init(struct drm_i915_private *i915,
 
 int i915_gem_timeline_init__global(struct drm_i915_private *i915)
 {
-       static struct lock_class_key class;
-
        return __i915_gem_timeline_init(i915,
                                        &i915->gt.global_timeline,
                                        "[execution]",
-                                       &class, "&global_timeline->lock");
+                                       &__lockdep_no_validate__,
+                                       "&global_timeline->lock");
 }

Keeping the shortcut does speed up the rescheduling, but we still have
the icky nesting that requires a fat comment and games with lockdep.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre