Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes: > Quoting Mika Kuoppala (2020-06-16 09:55:04) >> >> Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes: >> >> > Not too long ago, we realised we had issues with a rolling back a >> > context so far for a preemption request we considered the resubmit not >> > to be a rollback but a forward roll. This means we would issue a lite >> > restore instead of forcing a full restore, continuing execution of the >> > old requests rather than causing a preemption. Add a selftest to >> > exercise such a far rollback, such that if we were to skip the full >> > restore, we would execute invalid instructions in the ring and hang. >> > >> > Note that while I was able to confirm that this causes us to do a >> > lite-restore preemption rollback (with commit e36ba817fa96 ("drm/i915/gt: >> > Incrementally check for rewinding") disabled), it did not trick the HW >> > into rolling past the old RING_TAIL. Myybe on other HW. >> > >> > References: e36ba817fa96 ("drm/i915/gt: Incrementally check for rewinding") >> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> >> > Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> >> > --- >> > drivers/gpu/drm/i915/gt/selftest_lrc.c | 150 +++++++++++++++++++++++++ >> > 1 file changed, 150 insertions(+) >> > >> > diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c >> > index 91543494f595..3d088116a055 100644 >> > --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c >> > +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c >> > @@ -363,6 +363,155 @@ static int live_unlite_preempt(void *arg) >> > return live_unlite_restore(arg, I915_USER_PRIORITY(I915_PRIORITY_MAX)); >> > } >> > >> > +static int live_unlite_ring(void *arg) >> > +{ >> > + struct intel_gt *gt = arg; >> > + struct intel_engine_cs *engine; >> > + struct igt_spinner spin; >> > + enum intel_engine_id id; >> > + int err = 0; >> > + >> > + /* >> > + * Setup a preemption event that will cause almost the entire ring >> > + * to be unwound, potentially fooling our intel_ring_direction() >> > + * into emitting a forward lite-restore instead of the rollback. >> > + */ >> > + >> > + if (igt_spinner_init(&spin, gt)) >> > + return -ENOMEM; >> > + >> > + for_each_engine(engine, gt, id) { >> > + struct intel_context *ce[2] = {}; >> > + struct i915_request *rq; >> > + struct igt_live_test t; >> > + int n; >> > + >> > + if (!intel_engine_has_preemption(engine)) >> > + continue; >> > + >> > + if (!intel_engine_can_store_dword(engine)) >> > + continue; >> > + >> > + if (igt_live_test_begin(&t, gt->i915, __func__, engine->name)) { >> > + err = -EIO; >> > + break; >> > + } >> > + engine_heartbeat_disable(engine); >> > + >> > + for (n = 0; n < ARRAY_SIZE(ce); n++) { >> > + struct intel_context *tmp; >> > + >> > + tmp = intel_context_create(engine); >> > + if (IS_ERR(tmp)) { >> > + err = PTR_ERR(tmp); >> > + goto err_ce; >> > + } >> > + >> > + err = intel_context_pin(tmp); >> > + if (err) { >> > + intel_context_put(tmp); >> > + goto err_ce; >> > + } >> > + >> > + memset32(tmp->ring->vaddr, >> > + 0xdeadbeef, /* trigger a hang if executed */ >> > + tmp->ring->vma->size / sizeof(u32)); >> > + >> > + ce[n] = tmp; >> > + } >> > + >> > + rq = igt_spinner_create_request(&spin, ce[0], MI_ARB_CHECK); >> > + if (IS_ERR(rq)) { >> > + err = PTR_ERR(rq); >> > + goto err_ce; >> > + } >> > + >> > + i915_request_get(rq); >> > + rq->sched.attr.priority = I915_PRIORITY_BARRIER; >> > + i915_request_add(rq); >> > + >> > + if (!igt_wait_for_spinner(&spin, rq)) { >> > + intel_gt_set_wedged(gt); >> > + i915_request_put(rq); >> > + err = -ETIME; >> > + goto err_ce; >> > + } >> > + >> > + /* Fill the ring, until we will cause a wrap */ >> > + n = 0; >> > + while (intel_ring_direction(ce[0]->ring, >> > + rq->wa_tail, >> > + ce[0]->ring->tail) <= 0) { >> > + struct i915_request *tmp; >> >> I got that you tested it with revert of incremental, but >> >> can we make 2 versions of this test so that the half ring size >> is honoured and then another where we do few requests past the half? > > We have examples of normal preemption. This chooses to focus on the > impact of intel_ring_direction(). Reviewed-by: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> > >> Just would like to see the hardware get confused according >> to our assertions. > > I haven't tricked the HW into doing anything unexpected. I've tried > switching the spinner out for a semaphore in the ring (in case that > would keep the ring registers primed) and I've tried releasing the > spinner at the same time as trying to submit the preemption (though that > will be incredibly timing dependent) with the aim of having it process > the request tail at the same time as the ELSP. > -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx