Jani Nikula <jani.nikula@xxxxxxxxxxxxxxx> writes: > On Thu, 19 Nov 2015, Daniel Vetter <daniel@xxxxxxxx> wrote: >> On Thu, Nov 19, 2015 at 11:41:07AM +0200, Mika Kuoppala wrote: >>> Daniel Vetter <daniel@xxxxxxxx> writes: >>> >>> > On Mon, Nov 02, 2015 at 11:25:08AM +0200, Mika Kuoppala wrote: >>> >> Gen9 has had demonstrated cases where forcing a not ready gpu >>> >> into reset has caused system hang [1]. >>> >> >>> >> Gen8 has never to this date demonstrated such behaviour. >>> >> >>> >> In our CI tests there have been two cases of bsw ending in a state >>> >> where it claims it is not ready for reset, based on reset request, >>> >> after gpu hang [2]. Both of these cases have happened with kernel >>> >> where there are lots of debugs enabled. So it is possible that >>> >> something timing related is at play here like that wait_for_register() >>> >> usleep wakeups collide badly with forcewake. >>> >> >>> >> If we assume that gen8 is safe to reset even with claims of nonreadiness, >>> >> we can alleviate this situations by forcing a reset and revive the gpu. >>> >> Enhance logging so that it will be clear what conditions led to decision >>> >> of proceeding or bailing out, so that we will spot if this way of forcing >>> >> our will against gpu turns out to be foolhardy. >>> >> >>> >> v2: - add bugzilla entry for bsw behaviour (Chris) >>> >> - improve commit message >>> >> >>> >> References [1]: https://bugs.freedesktop.org/show_bug.cgi?id=89959 >>> >> References [2]: https://bugs.freedesktop.org/show_bug.cgi?id=92774 >>> >> Cc: Daniel Vetter <daniel.vetter@xxxxxxxx> >>> >> Cc: Tomi Sarvela <tomix.p.sarvela@xxxxxxxxx> >>> >> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> >>> >> Signed-off-by: Mika Kuoppala <mika.kuoppala@xxxxxxxxx> >>> >> --- >>> >> drivers/gpu/drm/i915/intel_uncore.c | 9 ++++++++- >>> >> 1 file changed, 8 insertions(+), 1 deletion(-) >>> >> >>> >> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c >>> >> index f0f97b2..47c17f2 100644 >>> >> --- a/drivers/gpu/drm/i915/intel_uncore.c >>> >> +++ b/drivers/gpu/drm/i915/intel_uncore.c >>> >> @@ -1504,7 +1504,14 @@ not_ready: >>> >> I915_WRITE(RING_RESET_CTL(engine->mmio_base), >>> >> _MASKED_BIT_DISABLE(RESET_CTL_REQUEST_RESET)); >>> >> >>> >> - return -EIO; >>> >> + if (INTEL_INFO(dev)->gen == 9) { >>> >> + DRM_ERROR("Reset would risk system stability, bailing out\n"); >>> >> + return -EIO; >>> >> + } >>> >> + >>> >> + DRM_ERROR("Forcing non ready gpu into reset\n"); >>> > >>> > DRM_ERROR will cause dmes-warn failures when running igt. If we agree that >>> > this is ok, it should be at most DRM_INFO. I'd got with DRM_DEBUG_DRIVER. >>> > Ack with that change. >>> >>> This is in the category 'shouldn't never happen' so I would prefer >>> igt to complain loudly if it can't request reset. >>> >>> The real culprit was forcewakes missing on reset path, so >>> this patch can be forgotten and only resurrected if, >>> even with forcewake fix in, gen8 fails to request reset. >>> >>> Here is the patch that fixed the underlying issue: >>> >>> commit 99106bc17e667989b4c0af0a6afcbd6ddbada8fb >>> Author: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> >>> Date: Thu Nov 5 13:11:38 2015 +0200 >>> >>> drm/i915: Do graphics device reset under forcewake >> >> Oh that story. With the above explanation added to the commit message: >> >> Reviewed-by: Daniel Vetter <daniel.vetter@xxxxxxxx> > > Mika, is this patch still relevant? Please repost with the above fixed > if it is. > I think we should forget this and only resurrect it if we ever again see failed resets. As commit 99106bc17e667989b4c0af0a6afcbd6ddbada8fb Author: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> Date: Thu Nov 5 13:11:38 2015 +0200 drm/i915: Do graphics device reset under forcewake should have fixed atleast those that we saw on CI. -Mika > BR, > Jani. > > >> >>> >>> >>> Thanks, >>> -Mika >>> >>> > -Daniel >>> > >>> >> + >>> >> + return gen6_do_reset(dev); >>> >> } >>> >> >>> >> static int (*intel_get_gpu_reset(struct drm_device *dev))(struct drm_device *) >>> >> -- >>> >> 2.5.0 >>> >> >>> > >>> > -- >>> > Daniel Vetter >>> > Software Engineer, Intel Corporation >>> > http://blog.ffwll.ch > > -- > Jani Nikula, Intel Open Source Technology Center _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx