Quoting Mika Kuoppala (2019-05-29 11:15:46) > Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes: > > > It seems like the HW validator is getting better at preventing our > > snooping of system registers from non-privileged batches! If we can't > > use SRM, let's probe the register directly through mmio, making sure we > > have the context spinning on the GPU first. > > > > v2: Hold forcewake just in case the spinning batch isn't enough to > > justify our register access. > > > > If I recall correctly, either of them separately didn't > work. And there was delay after grabbing the fw before > the register contents appeared. Don't remember the gen tho. That would be a kernel bug / HW bug. Either we fail in our ack sequence (maybe let the read overtake the fw ack or something equally impossible), or the HW fails in its. If you can think of a way of spotting it, add it to selftests/intel_uncore -- it's exactly the type of test we need in bring up, verifying our mmio is accurate. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110544 > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > Cc: Matthew Auld <matthew.william.auld@xxxxxxxxx> > > --- > > tests/i915/gem_workarounds.c | 88 +++++++----------------------------- > > 1 file changed, 17 insertions(+), 71 deletions(-) > > > > diff --git a/tests/i915/gem_workarounds.c b/tests/i915/gem_workarounds.c > > index 44e3dce8a..2767b04d7 100644 > > --- a/tests/i915/gem_workarounds.c > > +++ b/tests/i915/gem_workarounds.c > > @@ -80,70 +80,27 @@ static bool write_only(const uint32_t addr) > > return false; > > } > > > > -#define MI_STORE_REGISTER_MEM (0x24 << 23) > > - > > -static int workaround_fail_count(int fd, uint32_t ctx) > > +static int workaround_fail_count(int i915, uint32_t ctx) > > { > > - struct drm_i915_gem_exec_object2 obj[2]; > > - struct drm_i915_gem_relocation_entry *reloc; > > - struct drm_i915_gem_execbuffer2 execbuf; > > - uint32_t result_sz, batch_sz; > > - uint32_t *base, *out; > > - int fail_count = 0; > > - > > - reloc = calloc(num_wa_regs, sizeof(*reloc)); > > - igt_assert(reloc); > > - > > - result_sz = 4 * num_wa_regs; > > - result_sz = PAGE_ALIGN(result_sz); > > - > > - batch_sz = 16 * num_wa_regs + 4; > > - batch_sz = PAGE_ALIGN(batch_sz); > > - > > - memset(obj, 0, sizeof(obj)); > > - obj[0].handle = gem_create(fd, result_sz); > > - gem_set_caching(fd, obj[0].handle, I915_CACHING_CACHED); > > - obj[1].handle = gem_create(fd, batch_sz); > > - obj[1].relocs_ptr = to_user_pointer(reloc); > > - obj[1].relocation_count = num_wa_regs; > > - > > - out = base = gem_mmap__cpu(fd, obj[1].handle, 0, batch_sz, PROT_WRITE); > > - for (int i = 0; i < num_wa_regs; i++) { > > - *out++ = MI_STORE_REGISTER_MEM | ((gen >= 8 ? 4 : 2) - 2); > > - *out++ = wa_regs[i].addr; > > - reloc[i].target_handle = obj[0].handle; > > - reloc[i].offset = (out - base) * sizeof(*out); > > - reloc[i].delta = i * sizeof(uint32_t); > > - reloc[i].read_domains = I915_GEM_DOMAIN_INSTRUCTION; > > - reloc[i].write_domain = I915_GEM_DOMAIN_INSTRUCTION; > > - *out++ = reloc[i].delta; > > - if (gen >= 8) > > - *out++ = 0; > > - } > > - *out++ = MI_BATCH_BUFFER_END; > > - munmap(base, batch_sz); > > + igt_spin_t *spin; > > + int fw, fail = 0; > > > > - memset(&execbuf, 0, sizeof(execbuf)); > > - execbuf.buffers_ptr = to_user_pointer(obj); > > - execbuf.buffer_count = 2; > > - execbuf.rsvd1 = ctx; > > - gem_execbuf(fd, &execbuf); > > + spin = igt_spin_new(i915, .ctx = ctx, .flags = IGT_SPIN_POLL_RUN); > > + igt_spin_busywait_until_started(spin); > > > > - gem_set_domain(fd, obj[0].handle, I915_GEM_DOMAIN_CPU, 0); > > - > > - igt_debug("Address\tval\t\tmask\t\tread\t\tresult\n"); > > - > > - out = gem_mmap__cpu(fd, obj[0].handle, 0, result_sz, PROT_READ); > > + fw = igt_open_forcewake_handle(i915); > > assert that it went fine? Do we always expect the debugfs to be present? Do we strictly need it? igt_debug if it isn't present and correlate that to a fail. > Perhaps both will now do the trick. But if it fails > get the forcewake before spinner so you get more delay. Nah, I vote that in that case forcewake is broken and needs a kernel fix, as we don't have any such delay when using I915_READ(). -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx