On Thu, May 18, 2017 at 03:51:19PM +0300, Mika Kuoppala wrote: > Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes: > > > On Thu, May 18, 2017 at 03:34:22PM +0300, Mika Kuoppala wrote: > >> ELK seems to very picky about the preconditions to reset. > >> Evidence on Eaglelake (8086:2e12 (rev 03)) shows that it does > >> not like if reset occurs when there is active ring. > >> > >> Ville found out that there is workaround with name > >> 'WaMediaResetMainRingCleanup' which suggests that we need to > >> cleanup rings before resetting. It is unclear what cleanup > >> exactly means but evidence shows that stopping the ring > >> does have an effect on reset reliability. > >> > >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100998 > >> Suggested-by: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> > >> Cc: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> > >> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > >> Cc: Tomi Sarvela <tomi.p.sarvela@xxxxxxxxx> > >> Signed-off-by: Mika Kuoppala <mika.kuoppala@xxxxxxxxx> > >> --- > >> drivers/gpu/drm/i915/intel_uncore.c | 26 ++++++++++++++++++++++++++ > >> 1 file changed, 26 insertions(+) > >> > >> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c > >> index 7eaaf2225e1a..1d473cd1f8a4 100644 > >> --- a/drivers/gpu/drm/i915/intel_uncore.c > >> +++ b/drivers/gpu/drm/i915/intel_uncore.c > >> @@ -1427,6 +1427,26 @@ int i915_reg_read_ioctl(struct drm_device *dev, > >> return ret; > >> } > >> > >> +static void gen3_stop_rings(struct drm_i915_private *dev_priv) > >> +{ > >> + struct intel_engine_cs *engine; > >> + enum intel_engine_id id; > >> + > >> + for_each_engine(engine, dev_priv, id) { > >> + const i915_reg_t mode_r = RING_MI_MODE(engine->mmio_base); > >> + > >> + I915_WRITE_FW(mode_r, _MASKED_BIT_ENABLE(STOP_RING)); > > > > This will only stop between instructions on the CS (i.e. within the > > ring) aiui . If we did hang in a shader, this is not going to achieve > > very much. > > The subject is too grandiose then. Should I add to the subject 'basic > hangs'? > > This seems to help get our hang tests through, bringing more coverage. > I pondered adding ctl, head and tail writes to zero. It would not help > with shades, but should not hurt either. > > > > >> + if (intel_wait_for_register_fw(dev_priv, > >> + mode_r, > >> + MODE_IDLE, > >> + MODE_IDLE, > >> + 1000)) { > >> + DRM_DEBUG("%s : timed out STOP_RING\n", > >> + engine->name); > > > > If we make this a DRM_ERROR I expect it to fire through the hang tests > > in BAT. > > Why would stopping the ring be a problem with a clean chained batch? > Well I try to see how it goes. My understanding is that the STOP_RING only takes affect on the ring itself, our chained batches stay away from the ring. At least that's the problem I encountered when trying to use STOP_RING + SYNC_FLUSH for seqno flushing. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx