Re: [PATCH v2] drm/i915: Stop engines when declaring the machine wedged

Quoting Mika Kuoppala (2018-03-16 08:58:28)
> Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:
> 
> > If we fail to reset the GPU, we declare the machine wedged. However, the
> > GPU may well still be running in the background with an in-flight
> > request. So despite our efforts in cleaning up the request queue and
> > faking the breadcrumb in the HWSP, the GPU may eventually write the
> > in-flight seqno there, breaking all of our assumptions and throwing the
> > driver into a deep turmoil, wedging beyond wedged.
> >
> > To avoid this we ideally want to reset the GPU. Since that has already
> > failed, make sure the rings have the stop bit set instead. This is part
> > of the normal GPU reset sequence, but that is actually disabled by
> > igt/gem_eio to force the wedged state. If we assume the worst, we must
> > poke at the bit again before we give up.
> >
> > v2: Move the intel_gpu_reset() from set-wedged in the reset error path
> > into i915_gem_set_wedged() itself. Even if the reset fails (e.g. if it is
> > disabled by gem_eio), it still tries to make sure the engines are
> > stopped. For i915_gem_set_wedged() callers from outside of i915_reset(),
> > this should make sure the GPU is disabled while the driver is marked as
> > being wedged.
> >
> > Testcase: igt/gem_eio
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
> > Cc: Michał Winiarski <michal.winiarski@xxxxxxxxx>
> > Cc: Michal Wajdeczko <michal.wajdeczko@xxxxxxxxx>
> > Cc: Michel Thierry <michel.thierry@xxxxxxxxx>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.c | 1 -
> >  drivers/gpu/drm/i915/i915_gem.c | 3 +++
> >  2 files changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index f03555efc520..3df5193487f3 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -1995,7 +1995,6 @@ void i915_reset(struct drm_i915_private *i915, unsigned int flags)
> >  error:
> >       i915_gem_set_wedged(i915);
> >       i915_retire_requests(i915);
> > -     intel_gpu_reset(i915, ALL_ENGINES);
> >       goto finish;
> >  }
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 2fbd622bba30..802df8e1a544 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3246,6 +3246,9 @@ void i915_gem_set_wedged(struct drm_i915_private *i915)
> >       }
> >       i915->caps.scheduler = 0;
> >  
> > +     /* Even if the GPU reset fails, it should still stop the engines */
> > +     intel_gpu_reset(i915, ALL_ENGINES);
> > +
> 
> The comment is very welcome here, as the modparams.reset usage isn't
> so transparent.
> 
> Reviewed-by: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>

Ta, gem_eio tamed, hopefully!
-Chris
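
For anyone skimming the thread, the ordering the patch relies on can be
sketched as a toy userspace model. This is not driver code: struct toy_gpu,
toy_gpu_reset(), toy_set_wedged() and toy_all_engines_stopped() are
hypothetical names, and reset_enabled only stands in for the reset module
parameter that gem_eio turns off. The assumption being illustrated is that
the reset path sets the stop bit on every ring before it attempts the reset
proper, so calling it from the set-wedged path leaves the engines stopped
even when the reset itself fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NUM_ENGINES 4

/* Hypothetical stand-in for the device state; not the i915 structures. */
struct toy_gpu {
	bool reset_enabled;                    /* models the reset modparam */
	bool engine_stopped[TOY_NUM_ENGINES];  /* models the ring stop bit */
	bool wedged;
};

/*
 * Toy reset: stop every engine first, then attempt the reset proper.
 * Returns 0 on success and -1 if the reset is unavailable. Either way
 * the engines have been stopped by the time we return.
 */
static int toy_gpu_reset(struct toy_gpu *gpu)
{
	for (size_t i = 0; i < TOY_NUM_ENGINES; i++)
		gpu->engine_stopped[i] = true;

	if (!gpu->reset_enabled)
		return -1; /* reset disabled, but the stop bits stay set */

	return 0;
}

/*
 * Toy set-wedged: mark the device wedged, then call the reset purely
 * for its side effect of stopping the engines. The return value is
 * deliberately ignored, since we are giving up on the GPU regardless.
 */
static void toy_set_wedged(struct toy_gpu *gpu)
{
	assert(gpu != NULL);

	gpu->wedged = true;
	(void)toy_gpu_reset(gpu);
}

/* Helper for callers: true only if every engine has its stop bit set. */
static bool toy_all_engines_stopped(const struct toy_gpu *gpu)
{
	for (size_t i = 0; i < TOY_NUM_ENGINES; i++)
		if (!gpu->engine_stopped[i])
			return false;
	return true;
}
```

With reset_enabled left false, toy_set_wedged() still leaves every engine
stopped, which is the property the v2 patch wants for callers outside of
i915_reset().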
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx



