On 17/02/2017 10:18, Chris Wilson wrote:
If the waiter was currently running, assume it hasn't had a chance to process the pending interupt (e.g, low priority task on a loaded system) and wait until it sleeps before declaring a missed interrupt. References: https://bugs.freedesktop.org/show_bug.cgi?id=99816 Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> --- drivers/gpu/drm/i915/intel_breadcrumbs.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c index 4395b177493e..2ad29fb77b2d 100644 --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c @@ -45,6 +45,15 @@ static void intel_breadcrumbs_hangcheck(unsigned long data) return; } + /* If the waiter was currently running, assume it hasn't had a chance + * to process the pending interupt (e.g, low priority task on a loaded + * system) and wait until it sleeps before declaring a missed interrupt. + */ + if (!intel_engine_wakeup(engine)) { + mod_timer(&b->hangcheck, wait_timeout()); + return; + } + DRM_DEBUG("Hangcheck timer elapsed... %s idle\n", engine->name); set_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings); mod_timer(&engine->breadcrumbs.fake_irq, jiffies + 1);
Change here is that we would never declare a GPU hang is userspace would just wait indefinitely, or in other words with this patch we would rely on userspace timing out on their waits in order to declare a hang.
Hm, in fact even with the current code, if the userspace keeps exiting and re-entering the wait we would be re-arming the hangcheck timer and so also never notice a GPU hang.
Regards, Tvrtko _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx