Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes: > Quoting Mika Kuoppala (2019-09-19 13:57:45) >> Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes: >> >> > Check that we are correctly invalidating the TLB at the start of a >> > batch after updating the GTT. >> > >> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> >> > Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> >> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@xxxxxxxxx> >> > --- >> > drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 227 ++++++++++++++++++ >> > 1 file changed, 227 insertions(+) >> > >> > diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c >> > index 598c18d10640..f8709b332bd7 100644 >> > --- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c >> > +++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c >> > @@ -25,13 +25,16 @@ >> > #include <linux/list_sort.h> >> > #include <linux/prime_numbers.h> >> > >> > +#include "gem/i915_gem_context.h" >> > #include "gem/selftests/mock_context.h" >> > +#include "gt/intel_context.h" >> > >> > #include "i915_random.h" >> > #include "i915_selftest.h" >> > >> > #include "mock_drm.h" >> > #include "mock_gem_device.h" >> > +#include "igt_flush_test.h" >> > >> > static void cleanup_freed_objects(struct drm_i915_private *i915) >> > { >> > @@ -1705,6 +1708,229 @@ int i915_gem_gtt_mock_selftests(void) >> > return err; >> > } >> > >> > +static int context_sync(struct intel_context *ce) >> > +{ >> > + struct i915_request *rq; >> > + long timeout; >> > + >> > + rq = intel_context_create_request(ce); >> > + if (IS_ERR(rq)) >> > + return PTR_ERR(rq); >> > + >> > + i915_request_get(rq); >> > + i915_request_add(rq); >> > + >> > + timeout = i915_request_wait(rq, 0, HZ / 5); >> > + i915_request_put(rq); >> > + >> > + return timeout < 0 ? -EIO : 0; >> > +} >> > + >> > +static int submit_batch(struct intel_context *ce, u64 addr) >> > +{ >> > + struct i915_request *rq; >> > + int err; >> > + >> > + rq = intel_context_create_request(ce); >> > + if (IS_ERR(rq)) >> > + return PTR_ERR(rq); >> > + >> > + err = 0; >> > + if (rq->engine->emit_init_breadcrumb) /* detect a hang */ >> >> for seqno write? > > If we expect an initial breadcrumb, we use it during reset to identify > if we are inside a payload (as opposed to being inside the semaphore > wait). So we need to emit the breadcrumb if we may hang in our batch. > >> > + err = rq->engine->emit_init_breadcrumb(rq); >> > + if (err == 0) >> > + err = rq->engine->emit_bb_start(rq, addr, 0, 0); >> > + >> >> In context_sync part we grabbed a reference. In here we >> don't. I don't see how we can get away without it even >> if we don't wait in here. > > Hmm, I suppose you can argue that we do now have a later deref in the > spinner. That wasn't there before... > >> > + vma = i915_vma_instance(out, vm, NULL); >> > + if (IS_ERR(vma)) { >> > + err = PTR_ERR(vma); >> > + goto out_batch; >> > + } >> > + >> > + err = i915_vma_pin(vma, 0, 0, >> > + PIN_USER | >> > + PIN_OFFSET_FIXED | >> > + (vm->total - PAGE_SIZE)); >> > + if (err) >> > + goto out_out; >> >> out_put? > > Joonas likes out: for normal exit paths that double for error handling. Ok. Fine with me. I just like the last part to describe what the first part of onion out does. Don't have to so much scroll back and forth in editor. > >> Oh and we don't have to do anything with the instance on >> error paths? > > No. The vma does not yet have an independent lifetime from the object > (they are all owned by objects currently). > >> > + /* Replace the TLB with target batches */ >> > + for (i = 0; i < count; i++) { >> > + u32 *cs = batch + i * 64 / sizeof(*cs); >> > + u64 addr; >> > + >> > + vma->node.start = offset + i * PAGE_SIZE; >> >> on previous loop, we have now primed the pte to tlb in here? > > We're using the same PTE as before, in the expectation that the TLB > still contains that lookup. > >> > + vm->insert_entries(vm, vma, I915_CACHE_NONE, 0); >> >> ..now changing in it here... >> >> > + >> > + addr = vma->node.start + i * 64; >> > + cs[4] = MI_NOOP; >> > + cs[6] = lower_32_bits(addr); >> > + cs[7] = upper_32_bits(addr); >> > + wmb(); >> > + >> > + err = submit_batch(ce, addr); >> >> in hope that with this submission hardware will see the old one and >> completely miss the store+bb start? > > The hope is we see the right page of course! The test is to detect when > it sees the old page instead. Right, I guess it just depends who is hoping wrt to outcome =) > >> Perhaps rewiring a more dummy emit only to prove this case >> is pushing it. >> >> But I am curious to know if you did try it out by removing >> the invalidate on the emits and managed to bring >> out the missing writes? > > We can demonstrate it on Tigerlake :) > > Indeed it detects the remove of MI_INVALIDATE_TLB elsewhere. Neat addition to our triaging kit this will be. -Mika _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx