On Tue, Jun 16, 2015 at 04:39:23PM +0300, Mika Kuoppala wrote:
> In order for skl+ hardware to guarantee that no context switch
> takes place during reset and that current context is properly
> saved, the driver needs to notify and query hw before commencing
> with reset.
>
> We will only proceed with reset if all engines report that they
> are ready for reset.
>
> As we skip the reset if any single engine reports not ready, this
> commit prevents system hang skl in some situations where the
> gpu/blitter is hanged and in such state that any write to generic

s/is hanged/is wedged/ reads better

> reset register (GEN6_GDRST) causes immediate system hang.
>
> References: https://bugs.freedesktop.org/show_bug.cgi?id=89959
> References: https://bugs.freedesktop.org/show_bug.cgi?id=90854
> Signed-off-by: Mika Kuoppala <mika.kuoppala@xxxxxxxxx>
> ---
>  drivers/gpu/drm/i915/i915_reg.h     |  3 +++
>  drivers/gpu/drm/i915/intel_uncore.c | 32 +++++++++++++++++++++++++++++++-
>  2 files changed, 34 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 0b979ad..3684f92 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -1461,6 +1461,9 @@ enum skl_disp_power_wells {
>  #define RING_MAX_IDLE(base)	((base)+0x54)
>  #define RING_HWS_PGA(base)	((base)+0x80)
>  #define RING_HWS_PGA_GEN6(base)	((base)+0x2080)
> +#define RING_RESET_CTL(base)	((base)+0xd0)
> +#define   RESET_CTL_REQUEST_RESET  (1 << 0)
> +#define   RESET_CTL_READY_TO_RESET (1 << 1)
>
>  #define HSW_GTT_CACHE_EN	0x4024
>  #define GTT_CACHE_EN_ALL	0xF0007FFF
> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
> index 4a86cf0..404bce2 100644
> --- a/drivers/gpu/drm/i915/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/intel_uncore.c
> @@ -1455,9 +1455,39 @@ static int gen6_do_reset(struct drm_device *dev)
>  	return ret;
>  }
>
> +static int wait_for_bits_set(struct drm_i915_private *dev_priv,
> +			     const u32 reg, const u32 mask, const int timeout)
> +{
> +	return wait_for((I915_READ(reg) & mask) == mask, timeout);
> +}
> +
> +static int gen9_do_reset(struct drm_device *dev)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct intel_engine_cs *engine;
> +	int ret, i;
> +
> +	for_each_ring(engine, dev_priv, i) {
> +		I915_WRITE(RING_RESET_CTL(engine->mmio_base),
> +			   _MASKED_BIT_ENABLE(RESET_CTL_REQUEST_RESET));
> +
> +		ret = wait_for_bits_set(dev_priv,
> +					RING_RESET_CTL(engine->mmio_base),
> +					RESET_CTL_READY_TO_RESET, 700);
> +		if (ret) {
> +			DRM_ERROR("%s: reset request timeout\n", engine->name);
> +			return -ENODEV;

return -EIO; since the reset didn't happen due to a hardware issue
(ENODEV means we have no reset implementation for this GPU, not that
the reset failed).

Do we need any recovery here? Do you guarantee that the GPU reset
resets the CTL register?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
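
To make the two review points concrete (return -EIO on a handshake timeout, and the open question about recovery), here is a minimal userspace sketch of the request/ready handshake. It is not the driver code: struct mock_engine, mock_write(), request_reset_all() and MAX_ENGINES are hypothetical names invented for this illustration, the register is simulated in memory, and the timeout is a poll count rather than milliseconds. Two things are assumptions and not something the patch or the review establishes: that withdrawing the request with a masked-disable write is a valid recovery step, and that the hardware drops READY_TO_RESET once the request is withdrawn. Only the bit layout and the masked-write encoding come from the patch and the driver's existing _MASKED_BIT_ENABLE convention.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit layout taken from the patch under review. */
#define RESET_CTL_REQUEST_RESET  (1u << 0)
#define RESET_CTL_READY_TO_RESET (1u << 1)

/* Masked-write encoding: the upper 16 bits select which lower bits change. */
#define MASKED_BIT_ENABLE(b)  (((uint32_t)(b) << 16) | (b))
#define MASKED_BIT_DISABLE(b) ((uint32_t)(b) << 16)

#define MAX_ENGINES 5	/* arbitrary bound for this sketch */

/* Hypothetical stand-in for one engine's RING_RESET_CTL register. */
struct mock_engine {
	uint32_t reset_ctl;	/* simulated register contents */
	bool responsive;	/* false models a wedged engine that never acks */
};

/* Simulated MMIO write that honours the masked-bit convention. */
static void mock_write(struct mock_engine *e, uint32_t val)
{
	uint32_t mask = val >> 16;

	e->reset_ctl = (e->reset_ctl & ~mask) | (val & mask);

	/* ASSUMED hardware model: a healthy engine acks a pending request
	 * and drops the ack once the request is withdrawn. */
	if ((e->reset_ctl & RESET_CTL_REQUEST_RESET) && e->responsive)
		e->reset_ctl |= RESET_CTL_READY_TO_RESET;
	else
		e->reset_ctl &= ~RESET_CTL_READY_TO_RESET;
}

/* Poll until all bits in mask are set; tries is a poll count, not ms. */
static int wait_for_bits_set(const struct mock_engine *e, uint32_t mask,
			     int tries)
{
	while (tries-- > 0)
		if ((e->reset_ctl & mask) == mask)
			return 0;
	return -ETIMEDOUT;
}

/* Ask every engine to prepare for reset; undo the requests on failure. */
static int request_reset_all(struct mock_engine *engines, int n)
{
	assert(n >= 0 && n <= MAX_ENGINES);

	for (int i = 0; i < n; i++) {
		mock_write(&engines[i],
			   MASKED_BIT_ENABLE(RESET_CTL_REQUEST_RESET));

		if (wait_for_bits_set(&engines[i],
				      RESET_CTL_READY_TO_RESET, 700)) {
			/* ASSUMED recovery: withdraw every request issued
			 * so far, including the one that timed out. */
			for (int j = 0; j <= i; j++)
				mock_write(&engines[j],
					   MASKED_BIT_DISABLE(RESET_CTL_REQUEST_RESET));
			return -EIO;	/* hardware failed to respond */
		}
	}
	return 0;
}
```

A caller would only go on to write the global reset register when request_reset_all() returns 0. Whether the real hardware needs the explicit un-request on the failure path, or clears RING_RESET_CTL itself during a GPU reset, is exactly the question raised above and is not answered by this sketch.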