Quoting Tvrtko Ursulin (2020-10-14 09:36:08)
>
> On 13/10/2020 16:35, Chris Wilson wrote:
> > Repeat our sanitychecks from before execution to after execution. One
> > expects that if we were to see these, the gpu would already be on fire,
> > but the timing may be informative.
> >
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_lrc.c | 10 +++++++---
> >   1 file changed, 7 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > index 287537089c77..3dbdd5d0cb60 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > @@ -1216,7 +1216,8 @@ static void intel_engine_context_out(struct intel_engine_cs *engine)
> >
> >   static void
> >   execlists_check_context(const struct intel_context *ce,
> > -			const struct intel_engine_cs *engine)
> > +			const struct intel_engine_cs *engine,
> > +			const char *when)
> >   {
> >   	const struct intel_ring *ring = ce->ring;
> >   	u32 *regs = ce->lrc_reg_state;
> > @@ -1251,7 +1252,7 @@ execlists_check_context(const struct intel_context *ce,
> >   		valid = false;
> >   	}
> >
> > -	WARN_ONCE(!valid, "Invalid lrc state found before submission\n");
> > +	WARN_ONCE(!valid, "Invalid lrc state found %s submission\n", when);
> >   }
> >
> >   static void restore_default_state(struct intel_context *ce,
> > @@ -1347,7 +1348,7 @@ __execlists_schedule_in(struct i915_request *rq)
> >   		reset_active(rq, engine);
> >
> >   	if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
> > -		execlists_check_context(ce, engine);
> > +		execlists_check_context(ce, engine, "before");
> >
> >   	if (ce->tag) {
> >   		/* Use a fixed tag for OA and friends */
> > @@ -1418,6 +1419,9 @@ __execlists_schedule_out(struct i915_request *rq,
> >   	 * refrain from doing non-trivial work here.
> >   	 */
> >
> > +	if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
> > +		execlists_check_context(ce, engine, "after");
> > +
>
> CI failures here are either something super scary or a simple mistake
> which I cannot see. Or is engine retire, possible queued up before,
> racing with current schedule_out?

It's the unpark running while process_csb is not yet flushed, so we
scrub the kernel_context before it is scheduled-out. In theory our
scrubbing, which is there to simulate an issue, could itself be causing
a real one, but the window for that is quite slim.
-Chris
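
To make the ordering described above concrete, here is a minimal userspace sketch, not the driver code. Everything prefixed with toy_ is hypothetical, and the "scrub" is assumed to overwrite the saved register image with a poison value; the real i915 scrubbing may differ. It only illustrates why a check at schedule-out trips if the scrub lands before the pending schedule-out has been processed.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a context's saved register state (not the i915 struct). */
struct toy_context {
	uint32_t ring_start;      /* value stored in the register image */
	uint32_t expected_start;  /* value the ring says it should be */
};

/* Same shape as the patched check: one test, labelled by when it ran. */
static bool toy_check_context(const struct toy_context *ce, const char *when)
{
	bool valid = ce->ring_start == ce->expected_start;

	if (!valid)
		fprintf(stderr, "Invalid lrc state found %s submission\n", when);
	return valid;
}

/* Assumption: the debug scrub poisons the register image. */
static void toy_scrub(struct toy_context *ce)
{
	ce->ring_start = 0xdeadbeef;
}

/*
 * Run one schedule-in / schedule-out cycle. If scrub_before_out is set,
 * the scrub happens while the schedule-out is still pending, which is
 * the ordering described in the reply. Returns the result of the
 * "after" check.
 */
static bool toy_run(bool scrub_before_out)
{
	struct toy_context ce = { .ring_start = 0x1000, .expected_start = 0x1000 };

	toy_check_context(&ce, "before");   /* schedule-in side */
	if (scrub_before_out)
		toy_scrub(&ce);             /* unpark wins the race */
	return toy_check_context(&ce, "after"); /* schedule-out side */
}
```

Calling toy_run(false) passes both checks, while toy_run(true) reports invalid state "after" submission even though nothing was wrong with the context while it was executing.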