This was my pet project for the last few days, but I have to take a break
from working on it for now to do some real work ;-). The patches compile and
pass a basic test, but that's about it. There is still quite a bit of work
left to make this useful. The easiest next step would be to tie this into
error state.

The idea is pretty simple. It uses the HW watchdog to set a timeout per
batchbuffer, instead of a global software watchdog.

Pros:
* Potential for per-batch or per-ring watchdog values. I believe when/if we
  get to GPGPU workloads, this is particularly interesting.
* Batch granularity hang detection. This mostly just makes hang detection
  and recovery a bit easier IMO.

Cons:
* Blit ring doesn't have an interrupt. This means we still need the software
  watchdog, and it makes hang detection more complex. I've been led to
  believe future HW *may* have this interrupt.
* Semaphores

I'm looking for feedback, mainly from Daniel and Chris, on whether this is
worth pursuing further when I have more time. The idea would be to eventually
use this to implement much of the ARB robustness requirements instead of
doing a bunch of request list processing.

Ben Widawsky (4):
  drm/i915: Use HW watchdog for each batch
  drm/i915: Turn on watchdog interrupts
  drm/i915: Add a breadcrumb
  drm/i915: Display the failing seqno

 drivers/gpu/drm/i915/i915_drv.h         |  2 +-
 drivers/gpu/drm/i915/i915_irq.c         | 14 ++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h         |  5 +++++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 34 ++++++++++++++++++++++++++++++---
 drivers/gpu/drm/i915/intel_ringbuffer.h |  3 +++
 5 files changed, 54 insertions(+), 4 deletions(-)

-- 
1.7.11.2