During a reset, we may skip over completed requests and lost context-switch interrupts. Following the reset, we may then may end up with no active requests in the ELSP (and so do not resubmit to restart the engine), but have a queue of requests ready for execution. This is unlikely, it requires the last request to complete after the hang is detected, but not impossible. The outcome of this is that the engine stalls, possibly leading to full ring and indefinite wait under struct_mutex, eventually leading to a full driver hang. Alternatively, we can solve this by unsubmitting the incomplete requests and just kickstarting the tasklet. Michał has patches for that, which I initially disliked due to the extra complexity, but the complexity of this "simple" restart is growing... Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> Cc: Michał Winiarski <michal.winiarski@xxxxxxxxx> Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> Link: https://patchwork.freedesktop.org/patch/msgid/20170916204414.32762-1-chris@xxxxxxxxxxxxxxxxxx Reviewed-by: Michał Winiarski <michal.winiarski@xxxxxxxxx> --- drivers/gpu/drm/i915/intel_lrc.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index 8e5caa5d3973..0f578f76f79f 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ b/drivers/gpu/drm/i915/intel_lrc.c @@ -1357,8 +1357,12 @@ static int gen8_init_common_ring(struct intel_engine_cs *engine) submit = true; } - if (submit && !i915.enable_guc_submission) - execlists_submit_ports(engine); + if (!i915.enable_guc_submission) { + if (submit) + execlists_submit_ports(engine); + else if (engine->execlist_first) + tasklet_schedule(&engine->irq_tasklet); + } return 0; } -- 2.14.1 _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx