Hi Chris,

On Mon, Jan 04, 2021 at 11:51:42AM +0000, Chris Wilson wrote:
> If the engine reset fails, we will attempt to resume with the current
> inflight submissions. When that happens, we cannot assert that the
> engine reset cleared the pending submission, so do not.
>
> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2878
> Fixes: 16f2941ad307 ("drm/i915/gt: Replace direct submit with direct call to tasklet")
> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
> ---
>  drivers/gpu/drm/i915/gt/intel_engine_types.h  |  2 +
>  .../drm/i915/gt/intel_execlists_submission.c  |  6 +-
>  drivers/gpu/drm/i915/gt/intel_reset.c         |  3 +
>  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 75 +++++++++++++++++++
>  4 files changed, 85 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> index c28f4e190fe6..430066e5884c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
> @@ -561,6 +561,8 @@ struct intel_engine_cs {
>  		unsigned long stop_timeout_ms;
>  		unsigned long timeslice_duration_ms;
>  	} props, defaults;
> +
> +	I915_SELFTEST_DECLARE(struct fault_attr reset_timeout);
>  };
>
>  static inline bool
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index 2afbc0a4ca03..f02e3ae10d28 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -3047,9 +3047,13 @@ static void execlists_reset_finish(struct intel_engine_cs *engine)
>  	 * After a GPU reset, we may have requests to replay. Do so now while
>  	 * we still have the forcewake to be sure that the GPU is not allowed
>  	 * to sleep before we restart and reload a context.
> +	 *
> +	 * If the GPU reset fails, the engine may still be alive with requests
> +	 * inflight. We expect those to complete, or for the device to be
> +	 * reset as the next level of recovery, and as a final resort we
> +	 * will declare the device wedged.
>  	 */
>  	GEM_BUG_ON(!reset_in_progress(execlists));
> -	GEM_BUG_ON(engine->execlists.pending[0]);

I would have split this in two patches, but it looks good anyway.

Reviewed-by: Andi Shyti <andi.shyti@xxxxxxxxx>

Thanks,
Andi