On Tue, Dec 19, 2017 at 01:14:19PM +0000, Chris Wilson wrote:
> Useful for verifying our bookkeeper when we encounter is knowing whether
> we think the engine is idle at the time of the GPU hang.
>
> References: https://bugs.freedesktop.org/show_bug.cgi?id=104305

Here you mention the hang as a "false positive"... If it is a false
positive and we have this idle information, shouldn't we handle it
differently instead of throwing the error information and resetting
the GPU?

Or am I misunderstanding what you meant by "false positive"?

> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
> Cc: Michal Wajdeczko <michal.wajdeczko@xxxxxxxxx>

Anyway, the info here seems interesting, so

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx>

> ---
>  drivers/gpu/drm/i915/i915_drv.h       | 1 +
>  drivers/gpu/drm/i915/i915_gpu_error.c | 2 ++
>  2 files changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 1aba5657f5f0..8ca836851365 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -948,6 +948,7 @@ struct i915_gpu_state {
>  	struct drm_i915_error_engine {
>  		int engine_id;
>  		/* Software tracked state */
> +		bool idle;
>  		bool waiting;
>  		int num_waiters;
>  		unsigned long hangcheck_timestamp;
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index aba50aa613f1..50feec87c3a3 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -416,6 +416,7 @@ static void error_print_engine(struct drm_i915_error_state_buf *m,
>  	int n;
>
>  	err_printf(m, "%s command stream:\n", engine_str(ee->engine_id));
> +	err_printf(m, " IDLE?: %s\n", yesno(ee->idle));
>  	err_printf(m, " START: 0x%08x\n", ee->start);
>  	err_printf(m, " HEAD: 0x%08x [0x%08x]\n", ee->head, ee->rq_head);
>  	err_printf(m, " TAIL: 0x%08x [0x%08x, 0x%08x]\n",
> @@ -1256,6 +1257,7 @@ static void error_record_engine_registers(struct i915_gpu_state *error,
>  		ee->hws = I915_READ(mmio);
>  	}
>
> +	ee->idle = intel_engine_is_idle(engine);
>  	ee->hangcheck_timestamp = engine->hangcheck.action_timestamp;
>  	ee->hangcheck_action = engine->hangcheck.action;
>  	ee->hangcheck_stalled = engine->hangcheck.stalled;
> --
> 2.15.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
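
For anyone reading along, the capture-then-print pattern in the patch can be tried outside the kernel. The following is only a minimal userspace sketch, not the i915 code: `struct engine`, `struct error_engine`, `engine_is_idle()`, `record_engine()` and `print_engine()` are hypothetical stand-ins, and the idle criterion (no requests in flight) is an assumption made purely for illustration. Only the `yesno()` idea and the " IDLE?: %s" output line mirror the patch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the live engine state; not the i915 struct. */
struct engine {
	int id;
	int inflight;		/* requests submitted but not yet completed */
	int num_waiters;
};

/* Hypothetical stand-in for the per-engine error record. */
struct error_engine {
	int engine_id;
	/* Software tracked state */
	bool idle;
	bool waiting;
	int num_waiters;
};

static const char *yesno(bool v)
{
	return v ? "yes" : "no";
}

/* Assumed idle criterion for this sketch only: nothing in flight. */
static bool engine_is_idle(const struct engine *e)
{
	return e->inflight == 0;
}

/* Snapshot the software view of the engine at the time of the hang. */
static void record_engine(struct error_engine *ee, const struct engine *e)
{
	ee->engine_id = e->id;
	ee->idle = engine_is_idle(e);
	ee->num_waiters = e->num_waiters;
	ee->waiting = e->num_waiters > 0;
}

/* Format the snapshot later, from the record alone, not from live state. */
static int print_engine(char *buf, size_t len, const struct error_engine *ee)
{
	return snprintf(buf, len,
			"engine %d command stream:\n IDLE?: %s\n",
			ee->engine_id, yesno(ee->idle));
}
```

The point of recording the flag at capture time rather than at print time is that the dump then shows what the bookkeeping believed when the hang was declared, which is what one would compare against the hardware registers in the same dump.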