On Thu, 19 Nov 2015, Daniel Vetter <daniel@xxxxxxxx> wrote: > On Thu, Nov 19, 2015 at 11:41:07AM +0200, Mika Kuoppala wrote: >> Daniel Vetter <daniel@xxxxxxxx> writes: >> >> > On Mon, Nov 02, 2015 at 11:25:08AM +0200, Mika Kuoppala wrote: >> >> Gen9 has had demonstrated cases where forcing a not ready gpu >> >> into reset has caused system hang [1]. >> >> >> >> Gen8 has never to this date demonstrated such behaviour. >> >> >> >> In our CI tests there have been two cases of bsw ending in a state >> >> where it claims it is not ready for reset, based on reset request, >> >> after gpu hang [2]. Both of these cases have happened with kernel >> >> where there are lots of debugs enabled. So it is possible that >> >> something timing related is at play here like that wait_for_register() >> >> usleep wakeups collide badly with forcewake. >> >> >> >> If we assume that gen8 is safe to reset even with claims of nonreadiness, >> >> we can alleviate this situations by forcing a reset and revive the gpu. >> >> Enhance logging so that it will be clear what conditions led to decision >> >> of proceeding or bailing out, so that we will spot if this way of forcing >> >> our will against gpu turns out to be foolhardy. >> >> >> >> v2: - add bugzilla entry for bsw behaviour (Chris) >> >> - improve commit message >> >> >> >> References [1]: https://bugs.freedesktop.org/show_bug.cgi?id=89959 >> >> References [2]: https://bugs.freedesktop.org/show_bug.cgi?id=92774 >> >> Cc: Daniel Vetter <daniel.vetter@xxxxxxxx> >> >> Cc: Tomi Sarvela <tomix.p.sarvela@xxxxxxxxx> >> >> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> >> >> Signed-off-by: Mika Kuoppala <mika.kuoppala@xxxxxxxxx> >> >> --- >> >> drivers/gpu/drm/i915/intel_uncore.c | 9 ++++++++- >> >> 1 file changed, 8 insertions(+), 1 deletion(-) >> >> >> >> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c >> >> index f0f97b2..47c17f2 100644 >> >> --- a/drivers/gpu/drm/i915/intel_uncore.c >> >> +++ b/drivers/gpu/drm/i915/intel_uncore.c >> >> @@ -1504,7 +1504,14 @@ not_ready: >> >> I915_WRITE(RING_RESET_CTL(engine->mmio_base), >> >> _MASKED_BIT_DISABLE(RESET_CTL_REQUEST_RESET)); >> >> >> >> - return -EIO; >> >> + if (INTEL_INFO(dev)->gen == 9) { >> >> + DRM_ERROR("Reset would risk system stability, bailing out\n"); >> >> + return -EIO; >> >> + } >> >> + >> >> + DRM_ERROR("Forcing non ready gpu into reset\n"); >> > >> > DRM_ERROR will cause dmes-warn failures when running igt. If we agree that >> > this is ok, it should be at most DRM_INFO. I'd got with DRM_DEBUG_DRIVER. >> > Ack with that change. >> >> This is in the category 'shouldn't never happen' so I would prefer >> igt to complain loudly if it can't request reset. >> >> The real culprit was forcewakes missing on reset path, so >> this patch can be forgotten and only resurrected if, >> even with forcewake fix in, gen8 fails to request reset. >> >> Here is the patch that fixed the underlying issue: >> >> commit 99106bc17e667989b4c0af0a6afcbd6ddbada8fb >> Author: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> >> Date: Thu Nov 5 13:11:38 2015 +0200 >> >> drm/i915: Do graphics device reset under forcewake > > Oh that story. With the above explanation added to the commit message: > > Reviewed-by: Daniel Vetter <daniel.vetter@xxxxxxxx> Mika, is this patch still relevant? Please repost with the above fixed if it is. BR, Jani. > >> >> >> Thanks, >> -Mika >> >> > -Daniel >> > >> >> + >> >> + return gen6_do_reset(dev); >> >> } >> >> >> >> static int (*intel_get_gpu_reset(struct drm_device *dev))(struct drm_device *) >> >> -- >> >> 2.5.0 >> >> >> > >> > -- >> > Daniel Vetter >> > Software Engineer, Intel Corporation >> > http://blog.ffwll.ch -- Jani Nikula, Intel Open Source Technology Center _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx