When WaIdleLiteRestore isn't enough.

Fixes an odd hang on gen8 (both bsw and bdw) during gem_ctx_switch where,
to all intents and purposes, if we trigger a lite-restore while the HW is
processing the pipecontrol flushes, the RING is restored to the oword
following the command and the HW tries to execute the destination address
for the pipecontrol rather than a valid command. The theory is that the HW
doesn't like RING_HEAD being within a cacheline of the restored RING_TAIL,
so we can evade the issue by not triggering a lite-restore if we know we
are already inside the last request.

Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
---
 drivers/gpu/drm/i915/intel_lrc.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 029901a8fa38..5c50263e45d3 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -639,6 +639,19 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			if (port_count(&port[1]))
 				goto unlock;
 
+			/*
+			 * Skip invoking a lite-restore if we know we have already
+			 * started processing the last request queued to HW. This
+			 * prevents a mystery *unrecoverable* hang on gen8, maybe
+			 * related to updating TAIL within a cacheline of HEAD? (As
+			 * there is still a delay between submitting the ELSP update
+			 * and HW responding, we may still encounter whatever condition
+			 * trips up, just less often.)
+			 */
+			if (i915_seqno_passed(intel_engine_get_seqno(engine),
+					      last->global_seqno - 1))
+				goto unlock;
+
 			/*
 			 * WaIdleLiteRestore:bdw,skl
 			 * Apply the wa NOOPs to prevent
-- 
2.17.0
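
For readers unfamiliar with the seqno test in the hunk, here is a minimal
standalone sketch of the check. It is not the driver code. It assumes that
i915_seqno_passed() is the usual wraparound-safe signed-difference
comparison, and the names seqno_passed(), should_skip_lite_restore() and
check_examples() are hypothetical stand-ins invented for this illustration.
The idea is that once the HW-reported seqno has reached last - 1, the
request before the last one has completed, so the HW may already be
executing the last request and a lite-restore is skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Wraparound-safe "has seq1 reached seq2?" test, using the common
 * signed-difference idiom. Assumed to model i915_seqno_passed().
 */
static bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/*
 * Hypothetical helper: true if the request before 'last_seqno' has
 * completed, meaning the HW may already be inside the last request
 * and we should not resubmit (lite-restore) on top of it.
 */
static bool should_skip_lite_restore(uint32_t hw_seqno, uint32_t last_seqno)
{
	return seqno_passed(hw_seqno, last_seqno - 1);
}

/* A few illustrative cases, including 32-bit wraparound. */
static void check_examples(void)
{
	/* HW still on an earlier request: lite-restore is allowed. */
	assert(!should_skip_lite_restore(7, 10));
	/* Previous request completed: HW may be inside the last one. */
	assert(should_skip_lite_restore(9, 10));
	/* Last request itself completed. */
	assert(should_skip_lite_restore(10, 10));
	/* last_seqno - 1 wraps around to UINT32_MAX. */
	assert(should_skip_lite_restore(UINT32_MAX, 0));
	/* HW seqno just before the wrap, last request just after it. */
	assert(!should_skip_lite_restore(UINT32_MAX - 2, 1));
}
```

As the comment in the patch notes, this only narrows the window: the HW can
advance into the last request between the check and the ELSP write.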