On Mon, Dec 19, 2016 at 04:24:18PM +0200, Mika Kuoppala wrote:
> Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:
>
> > On Fri, Dec 16, 2016 at 12:20:05PM -0800, Michel Thierry wrote:
> >> From: Arun Siluvery <arun.siluvery@xxxxxxxxxxxxxxx>
> >>
> >> This change implements support for per-engine reset as an initial, less
> >> intrusive hang recovery option to be attempted before falling back to the
> >> legacy full GPU reset recovery mode if necessary. This is only supported
> >> from Gen8 onwards.
> >>
> >> Hangchecker determines which engines are hung and invokes error handler to
> >> recover from it. Error handler schedules recovery for each of those engines
> >> that are hung. The recovery procedure is as follows,
> >>  - identifies the request that caused the hang and it is dropped
> >>  - force engine to idle: this is done by issuing a reset request
> >>  - reset and re-init engine
> >>  - restart submissions to the engine
> >>
> >> If engine reset fails then we fall back to heavy weight full gpu reset
> >> which resets all engines and reinitializes complete state of HW and SW.
> >>
> >> v2: Rebase.
> >>
> >> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> >> Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
> >> Signed-off-by: Tomas Elf <tomas.elf@xxxxxxxxx>
> >> Signed-off-by: Arun Siluvery <arun.siluvery@xxxxxxxxxxxxxxx>
> >> Signed-off-by: Michel Thierry <michel.thierry@xxxxxxxxx>
> >> ---
> >>  drivers/gpu/drm/i915/i915_drv.c     | 56 +++++++++++++++++++++++++++++++++++--
> >>  drivers/gpu/drm/i915/i915_drv.h     |  3 ++
> >>  drivers/gpu/drm/i915/i915_gem.c     |  2 +-
> >>  drivers/gpu/drm/i915/intel_lrc.c    | 12 ++++++++
> >>  drivers/gpu/drm/i915/intel_lrc.h    |  1 +
> >>  drivers/gpu/drm/i915/intel_uncore.c | 41 ++++++++++++++++++++++++---
> >>  6 files changed, 108 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> >> index e5688edd62cd..a034793bc246 100644
> >> --- a/drivers/gpu/drm/i915/i915_drv.c
> >> +++ b/drivers/gpu/drm/i915/i915_drv.c
> >> @@ -1830,18 +1830,70 @@ void i915_reset(struct drm_i915_private *dev_priv)
> >>   *
> >>   * Reset a specific GPU engine. Useful if a hang is detected.
> >>   * Returns zero on successful reset or otherwise an error code.
> >> + *
> >> + * Procedure is fairly simple:
> >> + *  - identifies the request that caused the hang and it is dropped
> >> + *  - force engine to idle: this is done by issuing a reset request
> >> + *  - reset engine
> >> + *  - restart submissions to the engine
> >>   */
> >>  int i915_reset_engine(struct intel_engine_cs *engine)
> >
> > What's the serialisation between potential callers of
> > i915_reset_engine()?
> >
>
> I feel that making 'reset_in_progress' per engine feature
> would clarify this and would be more fitting as now it is
> the one engine that can be in reset at particular point in time,
> decoupled with others.

That is not what "reset_in_progress" means. It principally means
MUTEX_BACKOFF.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
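
Illustration only, not taken from the patch: the serialisation question
above can be made concrete with a small userspace C11 sketch. It assumes
one possible answer, a per-engine atomic guard so that only one caller
resets a given engine at a time, with a fall back to a full reset when the
engine reset fails. Every name here (sketch_engine, sketch_gpu,
sketch_reset_engine, sketch_handle_hang) is hypothetical; none of it is
i915 code, and the "resetting" flag is not i915's "reset_in_progress".

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_NUM_ENGINES 4

/* Hypothetical stand-in for an engine; not struct intel_engine_cs. */
struct sketch_engine {
	atomic_bool resetting;    /* true while one caller owns the reset */
	bool hw_reset_ok;         /* simulates whether the engine reset works */
	unsigned int reset_count; /* successful per-engine resets */
};

/* Hypothetical stand-in for the device; not struct drm_i915_private. */
struct sketch_gpu {
	struct sketch_engine engines[SKETCH_NUM_ENGINES];
	unsigned int full_reset_count; /* fall backs to a full reset */
};

static void sketch_gpu_init(struct sketch_gpu *gpu)
{
	for (int i = 0; i < SKETCH_NUM_ENGINES; i++) {
		atomic_init(&gpu->engines[i].resetting, false);
		gpu->engines[i].hw_reset_ok = true;
		gpu->engines[i].reset_count = 0;
	}
	gpu->full_reset_count = 0;
}

/* Returns 0 on success, -EBUSY if another caller already owns the
 * reset of this engine, -EIO if the simulated engine reset fails. */
static int sketch_reset_engine(struct sketch_engine *engine)
{
	int ret;

	/* Claim the engine: only the caller that flips false -> true proceeds. */
	if (atomic_exchange(&engine->resetting, true))
		return -EBUSY;

	/* The real steps (drop the guilty request, idle, reset, restart
	 * submission) are reduced to a single simulated outcome here. */
	ret = engine->hw_reset_ok ? 0 : -EIO;
	if (ret == 0)
		engine->reset_count++;

	atomic_store(&engine->resetting, false);
	return ret;
}

/* Try a per-engine reset for each hung engine; on a real failure fall
 * back to one full reset, which covers all engines. */
static void sketch_handle_hang(struct sketch_gpu *gpu, unsigned int hung_mask)
{
	for (int i = 0; i < SKETCH_NUM_ENGINES; i++) {
		int ret;

		if (!(hung_mask & (1u << i)))
			continue;

		ret = sketch_reset_engine(&gpu->engines[i]);
		if (ret && ret != -EBUSY) {
			gpu->full_reset_count++;
			return;
		}
	}
}
```

In this sketch a second caller that races on the same engine gets -EBUSY
and leaves the reset to the first caller, while resets of different
engines do not block each other. Whether that is the right policy for the
driver is exactly what the review question asks.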