Tomas Elf <tomas.elf@xxxxxxxxx> writes:

> On 22/05/2015 18:05, Mika Kuoppala wrote:
>> During review of dynamic page tables series, I was able
>> to hit a lite restore bug with execlists. I assume that
>> due to incorrect pd, the batch run out of legit address space
>> and into the scratch page area. The ACTHD was increasing
>> due to scratch being all zeroes (MI_NOOPs). And as gen8
>> address space is quite large, the hangcheck happily waited
>> for a long long time, keeping the process effectively stuck.
>>
>> According to Chris Wilson any modern gpu will grind to halt
>> if it encounters commands of all ones. This seemed to do the
>> trick and hang was declared promptly when the gpu wandered into
>> the scratch land.
>>
>> v2: Use 0xffff00ff pattern (Chris)
>
> Just for my own benefit:
>
> 1. Is there any particular reason for this pattern rather than 0xffffffff?
>
> 2. Someone please correct me if I'm wrong here but at least based on my
> own experiences with gen9 submitting batch buffers filled with bad
> instructions (0xffffffff) to the GPU does not hang it. I'm guessing that
> is because there's allegedly a hardware security parser that MI_NOOPs
> out invalid instructions during execution. If that's the case here then
> I guess we might have to come up with something else for gen9+ if we
> want to induce engine hangs once the execution reaches the scratch page?
>

If that is the case with gen9, then we need more duct tape. For example,
we could always increase busyness in hangcheck (a little), so that a hang
is eventually declared even though no loops are detected.

But with this patch and gen < 9, the execution grinds to a halt and
I get a hang declared within a 5 second window.

-Mika

> On the other hand, on gen9+ page faulting is supposedly not broken
> anymore so maybe we don't need the scratch page to begin with there so
> maybe it's all moot at that point? Again, if I'm making no sense here
> feel free to set things straight, I'm very curious about how all of this
> is supposed to work.
>
> Thanks,
> Tomas
>
>>
>> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
>> Signed-off-by: Mika Kuoppala <mika.kuoppala@xxxxxxxxx>
>> ---
>>  drivers/gpu/drm/i915/i915_gem_gtt.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> index 43fa543..a2a0c88 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
>> @@ -2168,6 +2168,8 @@ void i915_global_gtt_cleanup(struct drm_device *dev)
>>  	vm->cleanup(vm);
>>  }
>>
>> +#define SCRATCH_PAGE_MAGIC 0xffff00ffffff00ffULL
>> +
>>  static int alloc_scratch_page(struct i915_address_space *vm)
>>  {
>>  	struct i915_page_scratch *sp;
>> @@ -2185,6 +2187,7 @@ static int alloc_scratch_page(struct i915_address_space *vm)
>>  		return ret;
>>  	}
>>
>> +	fill_px(vm->dev, sp, SCRATCH_PAGE_MAGIC);
>>  	set_pages_uc(px_page(sp), 1);
>>
>>  	vm->scratch_page = sp;
>>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
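
For reference, here is a minimal userspace sketch of what filling the
scratch page with SCRATCH_PAGE_MAGIC amounts to, assuming a 4096-byte page.
The names fill_page_u64(), alloc_filled_scratch_page_sketch(),
count_mismatched_dwords() and SKETCH_PAGE_SIZE are hypothetical and exist
only for illustration; this is not the driver's fill_px(), whose
implementation is not shown in this thread. Both 32-bit halves of the
64-bit constant are 0xffff00ff, so every dword in the page should read back
as the v2 pattern regardless of byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: one scratch page is 4096 bytes. */
#define SKETCH_PAGE_SIZE 4096u

/* Same constant as in the patch. */
#define SCRATCH_PAGE_MAGIC 0xffff00ffffff00ffULL

/* Hypothetical stand-in for fill_px(): store val into every qword of the page. */
static void fill_page_u64(void *page, uint64_t val)
{
	uint64_t *q = page;
	size_t i;

	assert(page != NULL);
	for (i = 0; i < SKETCH_PAGE_SIZE / sizeof(*q); i++)
		q[i] = val;
}

/* Hypothetical helper: allocate a zeroed page and fill it with the magic. */
static void *alloc_filled_scratch_page_sketch(void)
{
	void *page = calloc(1, SKETCH_PAGE_SIZE);

	if (page)
		fill_page_u64(page, SCRATCH_PAGE_MAGIC);
	return page;
}

/* Count the dwords in the page that differ from the expected 32-bit pattern. */
static size_t count_mismatched_dwords(const void *page, uint32_t expected)
{
	const unsigned char *bytes = page;
	size_t i, bad = 0;

	for (i = 0; i < SKETCH_PAGE_SIZE / sizeof(uint32_t); i++) {
		uint32_t dw;

		/* memcpy avoids aliasing issues when reading the page as dwords. */
		memcpy(&dw, bytes + i * sizeof(dw), sizeof(dw));
		if (dw != expected)
			bad++;
	}
	return bad;
}
```

With a page obtained from alloc_filled_scratch_page_sketch(),
count_mismatched_dwords(page, 0xffff00ff) returns 0, whereas a freshly
zeroed page (the all-MI_NOOP case described in the commit message)
mismatches in all 1024 dwords.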