Re: [PATCH v2 3/3] drm/i915: Defer declaration of missed-interrupt until the waiter is asleep

On 17/02/2017 10:58, Chris Wilson wrote:
On Fri, Feb 17, 2017 at 10:48:50AM +0000, Tvrtko Ursulin wrote:

On 17/02/2017 10:18, Chris Wilson wrote:
If the waiter was currently running, assume it hasn't had a chance
to process the pending interrupt (e.g., low priority task on a loaded
system) and wait until it sleeps before declaring a missed interrupt.

References: https://bugs.freedesktop.org/show_bug.cgi?id=99816
Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
---
drivers/gpu/drm/i915/intel_breadcrumbs.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index 4395b177493e..2ad29fb77b2d 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -45,6 +45,15 @@ static void intel_breadcrumbs_hangcheck(unsigned long data)
		return;
	}

+	/* If the waiter was currently running, assume it hasn't had a chance
+	 * to process the pending interrupt (e.g., low priority task on a loaded
+	 * system) and wait until it sleeps before declaring a missed interrupt.
+	 */
+	if (!intel_engine_wakeup(engine)) {
+		mod_timer(&b->hangcheck, wait_timeout());
+		return;
+	}
+
	DRM_DEBUG("Hangcheck timer elapsed... %s idle\n", engine->name);
	set_bit(engine->id, &engine->i915->gpu_error.missed_irq_rings);
	mod_timer(&engine->breadcrumbs.fake_irq, jiffies + 1);


The change here is that we would never declare a GPU hang if userspace
just waited indefinitely; in other words, with this patch we
would rely on userspace timing out of its waits in order to
declare a hang.

Surely you mean the other way around? The only way we now get to declare a
missed-interrupt and then queue a hangcheck here is if userspace sleeps.

Hm, in fact even with the current code, if the userspace keeps
exiting and re-entering the wait we would be re-arming the hangcheck
timer and so also never notice a GPU hang.

Correct. It is not the only way we arm the GPU hangcheck.
gem_busy/hang, gem_wait/busy-hang check that we do detect hangs even if
userspace never sleeps.

Looks good after some more digging through the code and a brief IRC discussion. We now only fall back to rapid wakeups (fake_irq) if there are waiters, which is in line with the rest of the code.
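
For anyone following along, here is a minimal userspace sketch of the decision the patched timer callback makes, as I read it. It is not the driver code: struct model_engine, model_wakeup() and model_hangcheck() are hypothetical stand-ins, and the sketch assumes the wakeup helper reports true only when it actually woke a sleeping waiter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver state; not the i915 structures. */
enum waiter_state { NO_WAITER, WAITER_RUNNING, WAITER_ASLEEP };

struct model_engine {
	enum waiter_state waiter;
	bool missed_irq;      /* models the engine's bit in missed_irq_rings */
	bool fake_irq_armed;  /* models arming the fake_irq timer */
	int hangcheck_rearms; /* how many times the check was deferred */
};

/* Assumption: true only if a sleeping waiter was actually woken. */
static bool model_wakeup(struct model_engine *e)
{
	if (e->waiter != WAITER_ASLEEP)
		return false;

	e->waiter = WAITER_RUNNING;
	return true;
}

/* Models only the tail of the timer callback touched by the patch. */
static void model_hangcheck(struct model_engine *e)
{
	assert(e != NULL);

	/* Waiter running (or absent): defer and look again later. */
	if (!model_wakeup(e)) {
		e->hangcheck_rearms++;
		return;
	}

	/* The waiter slept through the interval: declare a missed irq. */
	e->missed_irq = true;
	e->fake_irq_armed = true;
}
```

In this model a running waiter only ever bumps hangcheck_rearms; the missed-interrupt flag and the fake_irq fallback are reached only once the waiter has gone to sleep.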

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx