We rely on ksoftirqd to run in a timely fashion in order to drain the execlists queue. Quite frequently, it does not. In some cases we may see latencies of over 200ms triggering our idle timeouts and forcing us to declare the driver wedged! Thus we can speed up idle detection by bypassing ksoftirqd in these cases and flush our tasklet to confirm if we are indeed still waiting for the ELSP to drain. v2: Put the execlists.first check back; it is required for handling reset! References: https://bugs.freedesktop.org/show_bug.cgi?id=106373 Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> Reviewed-by: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> --- drivers/gpu/drm/i915/intel_engine_cs.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c index 70325e0824e3..0f45d6826c8e 100644 --- a/drivers/gpu/drm/i915/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/intel_engine_cs.c @@ -945,10 +945,19 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine) return true; /* Waiting to drain ELSP? */ - if (READ_ONCE(engine->execlists.active)) - return false; + if (READ_ONCE(engine->execlists.active)) { + /* + * ksoftirqd has notorious latency that may cause us to + * timeout while waiting for the engine to idle as we wait for + * ksoftirqd to run the execlists tasklet to drain the ELSP. + * If we are expecting a context switch from the GPU, check now. + */ + execlists_tasklet(&engine->execlists); + if (READ_ONCE(engine->execlists.active)) + return false; + } - /* ELSP is empty, but there are ready requests? */ + /* ELSP is empty, but there are ready requests? E.g. after reset */ if (READ_ONCE(engine->execlists.first)) return false; -- 2.17.0 _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx