On Mon, Mar 27, 2017 at 11:11:47AM +0100, Tvrtko Ursulin wrote:
> 
> On 26/03/2017 09:46, Chris Wilson wrote:
> >Unlocking is dangerous. In this case we combine an early update to the
> >out-of-queue request, because we know that it will be inserted into the
> >correct FIFO priority-ordered slot when it becomes ready in the future.
> >However, given sufficient enthusiasm, it may become ready as we are
> >continuing to reschedule, and so may gazump the FIFO if we have since
> >dropped its spinlock. The result is that it may be executed too early,
> >before its dependees.
> >
> >Fixes: 20311bd35060 ("drm/i915/scheduler: Execute requests in order of priorities")
> >Testcase: igt/gem_exec_whisper
> >Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> >Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> >Cc: <stable@xxxxxxxxxxxxxxx> # v4.10+
> >---
> > drivers/gpu/drm/i915/intel_lrc.c | 54 +++++++++++++++++++++++++++-------------
> > 1 file changed, 37 insertions(+), 17 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> >index dd0e9d587852..3fdabba0a32d 100644
> >--- a/drivers/gpu/drm/i915/intel_lrc.c
> >+++ b/drivers/gpu/drm/i915/intel_lrc.c
> >@@ -658,30 +658,47 @@ static void execlists_submit_request(struct drm_i915_gem_request *request)
> > 	spin_unlock_irqrestore(&engine->timeline->lock, flags);
> > }
> >
> >-static struct intel_engine_cs *
> >-pt_lock_engine(struct i915_priotree *pt, struct intel_engine_cs *locked)
> >+static inline struct intel_engine_cs *
> >+pt_lock_engine(struct i915_priotree *pt, unsigned long *locked)
> > {
> >-	struct intel_engine_cs *engine;
> >-
> >-	engine = container_of(pt,
> >-			      struct drm_i915_gem_request,
> >-			      priotree)->engine;
> >-	if (engine != locked) {
> >-		if (locked)
> >-			spin_unlock_irq(&locked->timeline->lock);
> >-		spin_lock_irq(&engine->timeline->lock);
> >-	}
> >+	struct intel_engine_cs *engine =
> >+		container_of(pt, struct drm_i915_gem_request, priotree)->engine;
> >+
> >+	/* Locking the engines in a random order will rightfully trigger a
> >+	 * spasm in lockdep. However, we can ignore lockdep (by marking each
> >+	 * as a seperate nesting) so long as we never nest the
> >+	 * engine->timeline->lock elsewhere. Also the number of nesting
> >+	 * subclasses is severely limited (7) which is going to cause an
> >+	 * issue at some point.
> >+	 * BUILD_BUG_ON(I915_NUM_ENGINES >= MAX_LOCKDEP_SUBCLASSES);
> 
> Lets bite the bullet and not hide this BUILD_BUG_ON in a comment. :I

Another option appears to be to disable lockdep for the global engine
locks:

diff --git a/drivers/gpu/drm/i915/i915_gem_timeline.c b/drivers/gpu/drm/i915/i91
index b596ca7..f1b7dbe 100644
--- a/drivers/gpu/drm/i915/i915_gem_timeline.c
+++ b/drivers/gpu/drm/i915/i915_gem_timeline.c
@@ -73,12 +73,11 @@ int i915_gem_timeline_init(struct drm_i915_private *i915,
 
 int i915_gem_timeline_init__global(struct drm_i915_private *i915)
 {
-	static struct lock_class_key class;
-
 	return __i915_gem_timeline_init(i915,
 					&i915->gt.global_timeline,
 					"[execution]",
-					&class, "&global_timeline->lock");
+					&__lockdep_no_validate__,
+					"&global_timeline->lock");
 }

Keeping the shortcut does speed up the rescheduling, but we still have
the icky nesting that requires a fat comment and games with lockdep.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
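
For reference, the nesting-subclass idea being debated above could look roughly
like the sketch below. This is illustrative only, not the posted patch: the
quoted hunk is truncated, so the bitmask interpretation of the "unsigned long
*locked" argument, the use of the count of already-held locks (rather than the
engine id) as the lockdep subclass, and the assumption that the caller has
already disabled interrupts and later drops every lock it accumulated are all
guesses made for the sake of the example. It does show Tvrtko's request of
turning the subclass limit into an explicit build-time check.

/*
 * Sketch, not the real patch: accumulate engine->timeline locks while
 * walking the dependency tree, taking each new lock with a lockdep
 * subclass equal to the number of locks already held.  Because the
 * subclass reflects nesting depth rather than engine identity, lockdep
 * only ever sees a strictly increasing chain of classes regardless of
 * the order in which engines are visited.  Caller is assumed to run
 * with interrupts disabled and to release all locks set in *locked
 * once the walk completes.
 */
static inline struct intel_engine_cs *
pt_lock_engine_sketch(struct i915_priotree *pt, unsigned long *locked)
{
	struct intel_engine_cs *engine =
		container_of(pt, struct drm_i915_gem_request, priotree)->engine;

	/* Make the subclass limit a build error instead of a comment. */
	BUILD_BUG_ON(I915_NUM_ENGINES >= MAX_LOCKDEP_SUBCLASSES);

	if (!test_bit(engine->id, locked)) {
		/* Subclass = how many engine timeline locks we already hold. */
		spin_lock_nested(&engine->timeline->lock,
				 hweight_long(*locked));
		__set_bit(engine->id, locked);
	}

	return engine;
}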