Quoting Mika Kuoppala (2019-10-31 14:32:05) > Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes: > > > Check that the context's ring register state still matches our > > expectations prior to execution. > > > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> > > --- > > drivers/gpu/drm/i915/gt/intel_lrc.c | 73 ++++++++++++++++++++----- > > drivers/gpu/drm/i915/gt/intel_lrc_reg.h | 4 +- > > drivers/gpu/drm/i915/gt/selftest_lrc.c | 4 +- > > 3 files changed, 63 insertions(+), 18 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c > > index 5f61cd128d9c..666e70931524 100644 > > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c > > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c > > @@ -1146,6 +1146,60 @@ execlists_schedule_out(struct i915_request *rq) > > i915_request_put(rq); > > } > > > > +static int lrc_ring_mi_mode(const struct intel_engine_cs *engine) > > +{ > > + if (INTEL_GEN(engine->i915) >= 12) > > + return 0x60; > > + else if (INTEL_GEN(engine->i915) >= 9) > > + return 0x54; > > + else if (engine->class == RENDER_CLASS) > > + return 0x58; > > + else > > + return -1; > > +} > > + > > +static void > > +execlists_check_context(const struct intel_context *ce, > > + const struct intel_engine_cs *engine) > > +{ > > + const struct intel_ring *ring = ce->ring; > > + u32 *regs = ce->lrc_reg_state; > > + int x; > > + > > + if (regs[CTX_RING_START] != i915_ggtt_offset(ring->vma)) { > > + pr_err_once("%s: context submitted with incorrect RING_BUFFER_START [%08x], expected %08x\n", > > + engine->name, > > + regs[CTX_RING_START], > > + i915_ggtt_offset(ring->vma)); > > + regs[CTX_RING_START] = i915_ggtt_offset(ring->vma); > > + } > > + > > + if ((regs[CTX_RING_CTL] & ~(RING_WAIT | RING_WAIT_SEMAPHORE)) != > > + (RING_CTL_SIZE(ring->size) | RING_VALID)) { > > + pr_err_once("%s: context submitted with incorrect RING_BUFFER_CONTROL [%08x], expected %08x\n", > > + engine->name, > > + regs[CTX_RING_CTL], > > + (u32)(RING_CTL_SIZE(ring->size) | RING_VALID)); > > + regs[CTX_RING_CTL] = RING_CTL_SIZE(ring->size) | RING_VALID; > > We are going to submit by clearing the waits. First I thought this will > lead to disaster but the hardware works so that we clear on writing '1'. Afaik, they are only indicator bits. So the HW ignores those bits when we write to the register. > > + if (regs[CTX_BB_STATE] != RING_BB_PPGTT) { > > + pr_err_once("%s: context submitted with incorrect BB_STATE [%08x], expected %08x\n", > > + engine->name, > > + regs[CTX_BB_STATE], > > + RING_BB_PPGTT); > > + regs[CTX_BB_STATE] = RING_BB_PPGTT; > > + } > > + > > + x = lrc_ring_mi_mode(engine); > > + if (x != -1 && regs[x + 1] & STOP_RING) { > > + pr_err_once("%s: context submitted with STOP_RING [%08x] in RING_MI_MODE\n", > > + engine->name, regs[x + 1]); > > + regs[x + 1] &= ~STOP_RING; > > + regs[x + 1] |= STOP_RING << 16; > > Ok you want only to care about this one. Was pondering of restoring rest > of the state from previous. It was mostly in the spirit of "now that we are here, we might as well fix it up". It's mainly the paranoia in trying to ascertain if we are feeding garbage into the GPU. There's probably a lot more we can check, but the ring registers are essential :) > Will the hardware even care on masks at this point or is the saving part > setting them accordingly? Aiui, it uses the masks on the implicit LRI in the context restore to control where or not to actually apply the bits. Not that I have checked to see what the HW is doing (we will see if it ever flags an error), but memory says from elsewere it uses 0xffffxxxx. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx