Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:

> On Fri, Oct 30, 2015 at 04:43:49PM +0200, Mika Kuoppala wrote:
>> Gen9 has had demonstrated cases where forcing a not ready gpu
>> into reset has caused system hang [1].
>>
>> Gen8 has never to this date demonstrated such behaviour.
>>
>> In our CI tests bsw sometimes ends up in a state where it claims it
>> is not ready for reset, based on reset request, after gpu hang.
>>
>> Allow gen8 to reset even after claims of nonreadiness in order
>> to keep the gpu accessible. Enhance logging so that it will be
>> clear what conditions led to decision of proceeding or bailing out,
>> so that we will spot if this way of forcing our will against gpu turns
>> out to be foolhardy.
>>
>> References [1]: https://bugs.freedesktop.org/show_bug.cgi?id=89959
>> Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
>> Cc: Tomi Sarvela <tomix.p.sarvela@xxxxxxxxx>
>> Signed-off-by: Mika Kuoppala <mika.kuoppala@xxxxxxxxx>
>> ---
>>  drivers/gpu/drm/i915/intel_uncore.c | 9 ++++++++-
>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
>> index f0f97b2..47c17f2 100644
>> --- a/drivers/gpu/drm/i915/intel_uncore.c
>> +++ b/drivers/gpu/drm/i915/intel_uncore.c
>> @@ -1504,7 +1504,14 @@ not_ready:
>>  		I915_WRITE(RING_RESET_CTL(engine->mmio_base),
>>  			   _MASKED_BIT_DISABLE(RESET_CTL_REQUEST_RESET));
>>
>> -	return -EIO;
>
> Where's the reference for where we hit this EIO on gen8?
>

Internal CI logs, relevant part cut and pasted below. If you want the
full log, holler at me on IRC.

[ 119.147727] kms_pipe_crc_basic: starting subtest hang-read-crc-pipe-A
[ 124.785063] [drm] stuck on render ring
[ 124.800850] [drm] GPU HANG: ecode 8:0:0xfffffffe, in kms_pipe_crc_ba [5590], reason: Ring hung, action: reset
[ 124.801154] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 124.801161] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 124.801167] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 124.801173] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 124.801179] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 124.801785] kobject: 'card0' (ffff880174ad92a0): kobject_uevent_env
[ 124.801940] kobject: 'card0' (ffff880174ad92a0): fill_kobj_path: path = '/devices/pci0000:00/0000:00:02.0/drm/card0'
[ 124.805032] kobject: 'card0' (ffff880174ad92a0): kobject_uevent_env
[ 124.805089] kobject: 'card0' (ffff880174ad92a0): fill_kobj_path: path = '/devices/pci0000:00/0000:00:02.0/drm/card0'
[ 125.511744] [drm:gen8_do_reset [i915]] *ERROR* render ring: reset request timeout
[ 125.511922] [drm] Simulated gpu hang, resetting stop_rings
[ 125.511927] drm/i915: Resetting chip after gpu hang
[ 125.511954] [drm:i915_reset [i915]] *ERROR* Failed to reset chip: -5
[ 125.637612] kms_pipe_crc_basic: exiting, ret=0
[ 125.653608] [drm:intel_lr_context_deferred_alloc [i915]] *ERROR* ring create req: -5
[ 125.847695] gem_ctx_param_basic: executing
[ 125.850086] [drm:intel_lr_context_deferred_alloc [i915]] *ERROR* ring create req: -5
[ 125.854482] gem_ctx_param_basic: exiting, ret=99
[ 126.038693] kms_addfb_basic: executing
[ 126.041754] [drm:intel_lr_context_deferred_alloc [i915]] *ERROR* ring create req: -5

-Mika

>> +	if (INTEL_INFO(dev)->gen == 9) {
>> +		DRM_ERROR("Reset would risk system stability, bailing out\n");
>> +		return -EIO;
>> +	}
>> +
>> +	DRM_ERROR("Forcing non ready gpu into reset\n");
>> +
>> +	return gen6_do_reset(dev);
>>  }
>>
>>  static int (*intel_get_gpu_reset(struct drm_device *dev))(struct drm_device *)
>> --
>> 2.5.0
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> --
> Chris Wilson, Intel Open Source Technology Centre
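
For readers following the thread outside the kernel tree, the not_ready
policy the patch proposes can be modelled as a small userspace sketch.
This is only an illustration under stated assumptions, not the driver
code: handle_not_ready() and fake_full_reset() are hypothetical names,
the generation is passed as a plain int instead of being read through
INTEL_INFO(dev), and the stand-in for gen6_do_reset() always succeeds.
On Linux EIO is 5, which is the -5 seen in the "Failed to reset chip"
line of the log above.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for gen6_do_reset(); always reports success here. */
static int fake_full_reset(void)
{
	return 0;
}

/*
 * Models the not_ready path from the patch: gen9 bails out with -EIO,
 * any other generation is forced into a full reset even though the
 * engine claimed it was not ready.
 */
static int handle_not_ready(int gen)
{
	if (gen == 9) {
		fprintf(stderr, "Reset would risk system stability, bailing out\n");
		return -EIO;
	}

	fprintf(stderr, "Forcing non ready gpu into reset\n");
	return fake_full_reset();
}

/* Usage example: gen9 keeps the old -EIO behaviour, gen8 proceeds. */
void run_examples(void)
{
	assert(handle_not_ready(9) == -EIO);
	assert(handle_not_ready(8) == 0);
}
```

With the pre-patch behaviour both calls would return -EIO, which is the
failure bsw hits in the CI log; the sketch only changes the outcome for
generations other than 9.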