During review of the dynamic page tables series, I was able to hit a
lite restore bug with execlists. I assume that, due to an incorrect pd,
the batch ran out of the legitimate address space and into the scratch
page area. ACTHD kept increasing because the scratch page was all zeroes
(MI_NOOPs). And as the gen8 address space is quite large, hangcheck
happily waited for a long, long time, keeping the process effectively
stuck.

According to Chris Wilson, any modern gpu will grind to a halt if it
encounters commands of all ones. This seemed to do the trick, and a hang
was declared promptly when the gpu wandered into scratch land.

v2: Use 0xffff00ff pattern (Chris)

Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Mika Kuoppala <mika.kuoppala@xxxxxxxxx>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 0cc0cf4..be6521f 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2136,6 +2136,8 @@ void i915_global_gtt_cleanup(struct drm_device *dev)
 	vm->cleanup(vm);
 }
 
+#define SCRATCH_PAGE_MAGIC 0xffff00ffffff00ffULL
+
 static int alloc_scratch_page(struct i915_address_space *vm)
 {
 	struct i915_page_scratch *sp;
@@ -2153,6 +2155,7 @@ static int alloc_scratch_page(struct i915_address_space *vm)
 		return ret;
 	}
 
+	fill_px(vm->dev, sp, SCRATCH_PAGE_MAGIC);
 	set_pages_uc(px_page(sp), 1);
 
 	vm->scratch_page = sp;
-- 
1.9.1
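
For illustration only, the effect of the fill can be sketched in plain userspace C. This is not the driver code: fill_page64() and count_zero_dwords() are hypothetical stand-ins (the real patch uses fill_px() on the scratch page), and PAGE_BYTES assumes a 4096-byte page. The sketch only shows that a zeroed page reads back as nothing but zero dwords (the MI_NOOP encoding mentioned above), while a page filled with SCRATCH_PAGE_MAGIC contains none, with every dword reading 0xffff00ff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_BYTES 4096u /* assumed page size for this sketch */
#define SCRATCH_PAGE_MAGIC 0xffff00ffffff00ffULL

/*
 * Hypothetical stand-in for fill_px(): store the same 64-bit value
 * into every qword of one page. The page must be 8-byte aligned.
 */
static void fill_page64(uint64_t *page, uint64_t val)
{
	size_t i;

	assert(page != NULL);
	for (i = 0; i < PAGE_BYTES / sizeof(*page); i++)
		page[i] = val;
}

/*
 * Count the 32-bit dwords in the page that are zero, i.e. the ones a
 * command streamer would read as MI_NOOP and simply step over.
 */
static size_t count_zero_dwords(const uint64_t *page)
{
	const unsigned char *bytes = (const unsigned char *)page;
	size_t i, zeroes = 0;

	for (i = 0; i < PAGE_BYTES / sizeof(uint32_t); i++) {
		uint32_t dword;

		memcpy(&dword, bytes + i * sizeof(dword), sizeof(dword));
		if (dword == 0)
			zeroes++;
	}
	return zeroes;
}
```

Because both 32-bit halves of SCRATCH_PAGE_MAGIC are 0xffff00ff, the dword view of the page is the same regardless of host byte order.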