On Mon, May 20, 2019 at 07:19:58AM +0100, Tvrtko Ursulin wrote: > > On 17/05/2019 02:25, John Harrison wrote: > > On 5/15/2019 01:52, Tvrtko Ursulin wrote: > > > > > > On 13/05/2019 20:45, John.C.Harrison@xxxxxxxxx wrote: > > > > From: John Harrison <John.C.Harrison@xxxxxxxxx> > > > > > > > > With virtual engines, it is no longer possible to know which specific > > > > physical engine a given request will be executed on at the time that > > > > request is generated. This means that the request itself must be engine > > > > agnostic - any direct register writes must be relative to the engine > > > > and not absolute addresses. > > > > > > > > The LRI command has support for engine relative addressing. However, > > > > the mechanism is not transparent to the driver. The scheme for Gen11 > > > > (MI_LRI_ADD_CS_MMIO_START) requires the LRI address to have no > > > > absolute engine base component. The hardware then adds on the correct > > > > engine offset at execution time. > > > > > > > > Due to the non-trivial and differing schemes on different hardware, it > > > > is not possible to simply update the code that creates the LRI > > > > commands to set a remap flag and let the hardware get on with it. > > > > Instead, this patch adds function wrappers for generating the LRI > > > > command itself and then for constructing the correct address to use > > > > with the LRI. > > > > > > > > v2: Fix build break in GVT. Remove flags parameter [review feedback > > > > from Chris W]. > > > > > > > > v3: Fix build break in selftest. Rebase to newer base tree and fix > > > > merge conflict. > > > > > > > > v4: More rebasing. Rmoved relative addressing support from Gen7-9 only > > > > code paths [review feedback from Chris W]. > > > > > > > > v5: More rebasing (new 'gt' directory). Fix white space issue. Use > > > > COPY class rather than BCS0 id for checking against BCS engine. > > > > > > > > Signed-off-by: John Harrison <John.C.Harrison@xxxxxxxxx> > > > > CC: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx> > > > > CC: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> > > > > CC: Wilson, Chris P <chris.p.wilson@xxxxxxxxx> > > > > --- > > > > drivers/gpu/drm/i915/gt/intel_engine.h | 4 + > > > > drivers/gpu/drm/i915/gt/intel_engine_cs.c | 11 +++ > > > > drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 6 +- > > > > drivers/gpu/drm/i915/gt/intel_lrc.c | 79 ++++++++++--------- > > > > drivers/gpu/drm/i915/gt/intel_lrc_reg.h | 4 +- > > > > drivers/gpu/drm/i915/gt/intel_mocs.c | 17 ++-- > > > > drivers/gpu/drm/i915/gt/intel_ringbuffer.c | 45 +++++++++-- > > > > drivers/gpu/drm/i915/gt/intel_workarounds.c | 4 +- > > > > .../gpu/drm/i915/gt/selftest_workarounds.c | 9 ++- > > > > drivers/gpu/drm/i915/gvt/mmio_context.c | 16 +++- > > > > drivers/gpu/drm/i915/i915_cmd_parser.c | 4 +- > > > > drivers/gpu/drm/i915/i915_gem_context.c | 12 +-- > > > > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 3 +- > > > > drivers/gpu/drm/i915/i915_perf.c | 19 +++-- > > > > 14 files changed, 154 insertions(+), 79 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h > > > > b/drivers/gpu/drm/i915/gt/intel_engine.h > > > > index 9359b3a7ad9c..3506c992182c 100644 > > > > --- a/drivers/gpu/drm/i915/gt/intel_engine.h > > > > +++ b/drivers/gpu/drm/i915/gt/intel_engine.h > > > > @@ -546,4 +546,8 @@ static inline bool > > > > inject_preempt_hang(struct intel_engine_execlists *execlists) > > > > #endif > > > > +bool i915_engine_has_relative_lri(const struct > > > > intel_engine_cs *engine); > > > > +u32 i915_get_lri_cmd(const struct intel_engine_cs *engine, u32 > > > > word_count); > > > > +u32 i915_get_lri_reg(const struct intel_engine_cs *engine, > > > > i915_reg_t reg); > > > > + > > > > #endif /* _INTEL_RINGBUFFER_H_ */ > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c > > > > b/drivers/gpu/drm/i915/gt/intel_engine_cs.c > > > > index 4c3753c1b573..233295d689d2 100644 > > > > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c > > > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c > > > > @@ -253,6 +253,17 @@ static u32 __engine_mmio_base(struct > > > > drm_i915_private *i915, > > > > return bases[i].base; > > > > } > > > > +bool i915_engine_has_relative_lri(const struct > > > > intel_engine_cs *engine) > > > > +{ > > > > + if (INTEL_GEN(engine->i915) < 11) > > > > + return false; > > > > + > > > > + if (engine->class == COPY_ENGINE_CLASS) > > > > + return false; > > > > + > > > > + return true; > > > > > > I think engine->flags would be better. At least it is one > > > conditional instead of two at runtime, even one too many for > > > something that is invariant. > > > > > > > +} > > > > + > > > > static void __sprint_engine_name(char *name, const struct > > > > engine_info *info) > > > > { > > > > WARN_ON(snprintf(name, INTEL_ENGINE_CS_MAX_NAME, "%s%u", > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h > > > > b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h > > > > index a34ece53a771..e7784b3fb759 100644 > > > > --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h > > > > +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h > > > > @@ -123,9 +123,13 @@ > > > > * simply ignores the register load under certain conditions. > > > > * - One can actually load arbitrary many arbitrary registers: > > > > Simply issue x > > > > * address/value pairs. Don't overdue it, though, x <= 2^4 > > > > must hold! > > > > + * - Newer hardware supports engine relative addresses but > > > > older hardware does > > > > + * not. So never call MI_LRI directly, always use the > > > > i915_get_lri_cmd() > > > > + * and i915_get_lri_reg() helper functions. > > > > */ > > > > -#define MI_LOAD_REGISTER_IMM(x) MI_INSTR(0x22, 2*(x)-1) > > > > +#define __MI_LOAD_REGISTER_IMM(x) MI_INSTR(0x22, 2*(x)-1) > > > > > > So we end up with code using __MI_LOAD_REGISTER_IMM for absolute, > > > and i915_get_lri_cmd for relative addressing. One is a macro, > > > another is a function call, and naming/case is different. > > > > No. The __M_L_R_IMM version is for old code that can only run on > > pre-Gen11 devices. The helper function is for all other code. The caller > > does not know (or care) whether it should be using absolute or relative > > addressing. That is the point of the helper. > > > > See earlier discussion about wanting to make it totally obvious that > > using the simple macro version is the wrong thing to do in new code. > > > > > > > ... > > > > > > Then all call sites can use just this single helper and the naming > > > remains familiar. > > > > > > > I originally had a single helper to be used in all call sites. Chris > > objected violently to the idea of calling a helper in code that is > > purely pre-Gen11 and thus has no need of the helper. > > > > On the other hand, Rodrigo agreed with you that using the helper > > everywhere was cleaner. So it is now a 2 vs 1 vote... > > Maybe leaving the legacy code use the existing MI_LOAD_REGISTER_IMM then, > and just calling the new one MI_LOAD_REGISTER_IMM_REL would be passable? It > would satisfy my complaint that two helpers look so radically different. > Would make Chris happy and would make the patch smaller I think. I prefer this approach as well. > > Regards, > > Tvrtko > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@xxxxxxxxxxxxxxxxxxxxx > https://lists.freedesktop.org/mailman/listinfo/intel-gfx _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx