[PATCH] drm/i915: Trigger hangcheck if we detect more a repeating missed IRQ

ben at bwidawsk.net (Ben Widawsky) · Tue, 10 Apr 2012 16:59:11 -0700

On Tue, 10 Apr 2012 17:00:41 +0100
Chris Wilson <chris at chris-wilson.co.uk> wrote:

> On the first instance we just wish to kick the waiters and see if that
> terminates the wait conditions. If it does not, then we do not want to
> keep retrying without ever making any forward progress and becoming
> stuck in a hangcheck loop.
> 
> Reported-and-tested-by: Lukas Hejtmanek <xhejtman at fi.muni.cz>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48209
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>

I'm still confused about the problem we are purportedly fixing.

So this should happen if we've missed an IRQ (or the watchdog fired too
soon), and the watchdog then fires again before the thread has actually
woken up to realize that it missed the first IRQ?
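
To make that scenario concrete, here is a minimal toy model of the
behaviour the commit message describes: kick the waiters on the first
missed IRQ, and escalate if the next tick finds them still waiting. It
is only a sketch, not the i915 code; struct ring_model, hangcheck_tick()
and the HANGCHECK_* values are hypothetical names invented for this
illustration, and the threshold of one kick is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical, not the driver's. */
struct ring_model {
	bool has_waiters;              /* a thread is still sleeping on a request */
	unsigned int missed_irq_count; /* consecutive ticks that found stuck waiters */
};

enum hangcheck_action {
	HANGCHECK_NONE, /* nobody is waiting, nothing to do */
	HANGCHECK_KICK, /* first missed IRQ: wake the waiters and retry */
	HANGCHECK_HUNG, /* waiters still stuck after a kick: declare a hang */
};

/*
 * One watchdog tick for a ring that looks idle.  If waiters are still
 * present, an interrupt was presumably missed.  Kick them once; if the
 * following tick sees the same state, stop retrying and escalate.
 */
static enum hangcheck_action hangcheck_tick(struct ring_model *ring)
{
	assert(ring != NULL);

	if (!ring->has_waiters) {
		/* The wait condition terminated, so forget earlier misses. */
		ring->missed_irq_count = 0;
		return HANGCHECK_NONE;
	}

	ring->missed_irq_count++;
	if (ring->missed_irq_count == 1)
		return HANGCHECK_KICK;

	return HANGCHECK_HUNG;
}
```

In this model, the case in question is two consecutive ticks with
has_waiters still set: the first returns HANGCHECK_KICK, the second
returns HANGCHECK_HUNG instead of kicking again.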

As for extracting the kick_ring bit of code from the core of
hangcheck_elapsed, that looks fine. I just don't quite understand the
exact problem this solves, and I can't envision how we hit the case
that the patch is meant to fix.

Ben