On Thu, Nov 19, 2015 at 11:41:07AM +0200, Mika Kuoppala wrote: > Daniel Vetter <daniel@xxxxxxxx> writes: > > > On Mon, Nov 02, 2015 at 11:25:08AM +0200, Mika Kuoppala wrote: > >> Gen9 has had demonstrated cases where forcing a not ready gpu > >> into reset has caused system hang [1]. > >> > >> Gen8 has never to this date demonstrated such behaviour. > >> > >> In our CI tests there have been two cases of bsw ending in a state > >> where it claims it is not ready for reset, based on reset request, > >> after gpu hang [2]. Both of these cases have happened with kernel > >> where there are lots of debugs enabled. So it is possible that > >> something timing related is at play here like that wait_for_register() > >> usleep wakeups collide badly with forcewake. > >> > >> If we assume that gen8 is safe to reset even with claims of nonreadiness, > >> we can alleviate this situations by forcing a reset and revive the gpu. > >> Enhance logging so that it will be clear what conditions led to decision > >> of proceeding or bailing out, so that we will spot if this way of forcing > >> our will against gpu turns out to be foolhardy. > >> > >> v2: - add bugzilla entry for bsw behaviour (Chris) > >> - improve commit message > >> > >> References [1]: https://bugs.freedesktop.org/show_bug.cgi?id=89959 > >> References [2]: https://bugs.freedesktop.org/show_bug.cgi?id=92774 > >> Cc: Daniel Vetter <daniel.vetter@xxxxxxxx> > >> Cc: Tomi Sarvela <tomix.p.sarvela@xxxxxxxxx> > >> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > >> Signed-off-by: Mika Kuoppala <mika.kuoppala@xxxxxxxxx> > >> --- > >> drivers/gpu/drm/i915/intel_uncore.c | 9 ++++++++- > >> 1 file changed, 8 insertions(+), 1 deletion(-) > >> > >> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c > >> index f0f97b2..47c17f2 100644 > >> --- a/drivers/gpu/drm/i915/intel_uncore.c > >> +++ b/drivers/gpu/drm/i915/intel_uncore.c > >> @@ -1504,7 +1504,14 @@ not_ready: > >> I915_WRITE(RING_RESET_CTL(engine->mmio_base), > >> _MASKED_BIT_DISABLE(RESET_CTL_REQUEST_RESET)); > >> > >> - return -EIO; > >> + if (INTEL_INFO(dev)->gen == 9) { > >> + DRM_ERROR("Reset would risk system stability, bailing out\n"); > >> + return -EIO; > >> + } > >> + > >> + DRM_ERROR("Forcing non ready gpu into reset\n"); > > > > DRM_ERROR will cause dmes-warn failures when running igt. If we agree that > > this is ok, it should be at most DRM_INFO. I'd got with DRM_DEBUG_DRIVER. > > Ack with that change. > > This is in the category 'shouldn't never happen' so I would prefer > igt to complain loudly if it can't request reset. > > The real culprit was forcewakes missing on reset path, so > this patch can be forgotten and only resurrected if, > even with forcewake fix in, gen8 fails to request reset. > > Here is the patch that fixed the underlying issue: > > commit 99106bc17e667989b4c0af0a6afcbd6ddbada8fb > Author: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> > Date: Thu Nov 5 13:11:38 2015 +0200 > > drm/i915: Do graphics device reset under forcewake Oh that story. With the above explanation added to the commit message: Reviewed-by: Daniel Vetter <daniel.vetter@xxxxxxxx> > > > Thanks, > -Mika > > > -Daniel > > > >> + > >> + return gen6_do_reset(dev); > >> } > >> > >> static int (*intel_get_gpu_reset(struct drm_device *dev))(struct drm_device *) > >> -- > >> 2.5.0 > >> > > > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx