Quoting Mika Kuoppala (2018-03-16 08:58:28)
> Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:
>
> > If we fail to reset the GPU, we declare the machine wedged. However, the
> > GPU may well still be running in the background with an in-flight
> > request. So despite our efforts in cleaning up the request queue and
> > faking the breadcrumb in the HWSP, the GPU may eventually write the
> > in-flight seqno there breaking all of our assumptions and throwing the
> > driver into a deep turmoil, wedging beyond wedged.
> >
> > To avoid this we ideally want to reset the GPU. Since that has already
> > failed, make sure the rings have the stop bit set instead. This is part
> > of the normal GPU reset sequence, but that is actually disabled by
> > igt/gem_eio to force the wedged state. If we assume the worst, we must
> > poke at the bit again before we give up.
> >
> > v2: Move the intel_gpu_reset() from set-wedged in the reset error path
> > into i915_gem_set_wedged() itself. Even if the reset fails (e.g. if it is
> > disabled by gem_eio), it still tries to make sure the engines are
> > stopped. For i915_gem_set_wedged() callers from outside of i915_reset(),
> > this should make sure the GPU is disabled while the driver is marked as
> > being wedged.
> >
> > Testcase: igt/gem_eio
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
> > Cc: Michał Winiarski <michal.winiarski@xxxxxxxxx>
> > Cc: Michal Wajdeczko <michal.wajdeczko@xxxxxxxxx>
> > Cc: Michel Thierry <michel.thierry@xxxxxxxxx>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.c | 1 -
> >  drivers/gpu/drm/i915/i915_gem.c | 3 +++
> >  2 files changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index f03555efc520..3df5193487f3 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -1995,7 +1995,6 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
> >  error:
> >  	i915_gem_set_wedged(i915);
> >  	i915_retire_requests(i915);
> > -	intel_gpu_reset(i915, ALL_ENGINES);
> >  	goto finish;
> >  }
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 2fbd622bba30..802df8e1a544 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3246,6 +3246,9 @@ void i915_gem_set_wedged(struct drm_i915_private *i915)
> >  	}
> >  	i915->caps.scheduler = 0;
> >
> > +	/* Even if the GPU reset fails, it should still stop the engines */
> > +	intel_gpu_reset(i915, ALL_ENGINES);
> > +
>
> Comment is very welcome in here as modparm.reset usage isn't
> so transparent.
>
> Reviewed-by: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>

Ta, gem_eio tamed, hopefully!
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
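For anyone reading the thread later, the point of the v2 change can be sketched as a toy model in userspace C. This is only an illustration under stated assumptions, not the i915 code: struct toy_gpu, toy_gpu_reset(), toy_set_wedged() and toy_demo() are hypothetical names, and the reset_enabled field merely stands in for whatever disables the reset (e.g. what gem_eio does). The assumption being modelled is the one in the commit message: the reset path tries to stop the engines even when the reset itself cannot complete.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for device state; not the real i915 layout. */
struct toy_gpu {
	bool reset_enabled;   /* hypothetical: models a reset that can be disabled */
	bool engines_stopped;
	bool wedged;
};

/*
 * Models a reset whose first step stops the engines, regardless of
 * whether the reset itself is able to proceed afterwards.
 */
static int toy_gpu_reset(struct toy_gpu *gpu)
{
	gpu->engines_stopped = true;

	if (!gpu->reset_enabled)
		return -ENODEV; /* reset unavailable, but engines are stopped */

	return 0;
}

/*
 * Models the v2 ordering: mark the device wedged, then attempt the
 * reset purely for its side effect of stopping the engines. The
 * return value is deliberately ignored.
 */
static void toy_set_wedged(struct toy_gpu *gpu)
{
	gpu->wedged = true;
	(void)toy_gpu_reset(gpu);
}

/* Small self-check that can be called from any main(). */
void toy_demo(void)
{
	struct toy_gpu gpu = { .reset_enabled = false };

	toy_set_wedged(&gpu);
	assert(gpu.wedged);
	assert(gpu.engines_stopped);
}
```

With the reset attempt living inside the set-wedged helper, every caller gets the engines stopped, not just the error path of the reset function, which is the motivation given for v2 above.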