On 5/15/2019 01:52, Tvrtko Ursulin wrote:
On 13/05/2019 20:45, John.C.Harrison@xxxxxxxxx wrote:
From: John Harrison <John.C.Harrison@xxxxxxxxx>
With virtual engines, it is no longer possible to know which specific
physical engine a given request will be executed on at the time that
request is generated. This means that the request itself must be engine
agnostic - any direct register writes must be relative to the engine
and not absolute addresses.
The LRI command has support for engine relative addressing. However,
the mechanism is not transparent to the driver. The scheme for Gen11
(MI_LRI_ADD_CS_MMIO_START) requires the LRI address to have no
absolute engine base component. The hardware then adds on the correct
engine offset at execution time.
Due to the non-trivial and differing schemes on different hardware, it
is not possible to simply update the code that creates the LRI
commands to set a remap flag and let the hardware get on with it.
Instead, this patch adds function wrappers for generating the LRI
command itself and then for constructing the correct address to use
with the LRI.
v2: Fix build break in GVT. Remove flags parameter [review feedback
from Chris W].
v3: Fix build break in selftest. Rebase to newer base tree and fix
merge conflict.
v4: More rebasing. Rmoved relative addressing support from Gen7-9 only
code paths [review feedback from Chris W].
v5: More rebasing (new 'gt' directory). Fix white space issue. Use
COPY class rather than BCS0 id for checking against BCS engine.
Signed-off-by: John Harrison <John.C.Harrison@xxxxxxxxx>
CC: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx>
CC: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx>
CC: Wilson, Chris P <chris.p.wilson@xxxxxxxxx>
---
drivers/gpu/drm/i915/gt/intel_engine.h | 4 +
drivers/gpu/drm/i915/gt/intel_engine_cs.c | 11 +++
drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 6 +-
drivers/gpu/drm/i915/gt/intel_lrc.c | 79 ++++++++++---------
drivers/gpu/drm/i915/gt/intel_lrc_reg.h | 4 +-
drivers/gpu/drm/i915/gt/intel_mocs.c | 17 ++--
drivers/gpu/drm/i915/gt/intel_ringbuffer.c | 45 +++++++++--
drivers/gpu/drm/i915/gt/intel_workarounds.c | 4 +-
.../gpu/drm/i915/gt/selftest_workarounds.c | 9 ++-
drivers/gpu/drm/i915/gvt/mmio_context.c | 16 +++-
drivers/gpu/drm/i915/i915_cmd_parser.c | 4 +-
drivers/gpu/drm/i915/i915_gem_context.c | 12 +--
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 3 +-
drivers/gpu/drm/i915/i915_perf.c | 19 +++--
14 files changed, 154 insertions(+), 79 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h
b/drivers/gpu/drm/i915/gt/intel_engine.h
index 9359b3a7ad9c..3506c992182c 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -546,4 +546,8 @@ static inline bool inject_preempt_hang(struct
intel_engine_execlists *execlists)
#endif
+bool i915_engine_has_relative_lri(const struct intel_engine_cs
*engine);
+u32 i915_get_lri_cmd(const struct intel_engine_cs *engine, u32
word_count);
+u32 i915_get_lri_reg(const struct intel_engine_cs *engine,
i915_reg_t reg);
+
#endif /* _INTEL_RINGBUFFER_H_ */
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 4c3753c1b573..233295d689d2 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -253,6 +253,17 @@ static u32 __engine_mmio_base(struct
drm_i915_private *i915,
return bases[i].base;
}
+bool i915_engine_has_relative_lri(const struct intel_engine_cs
*engine)
+{
+ if (INTEL_GEN(engine->i915) < 11)
+ return false;
+
+ if (engine->class == COPY_ENGINE_CLASS)
+ return false;
+
+ return true;
I think engine->flags would be better. At least it is one conditional
instead of two at runtime, even one too many for something that is
invariant.
+}
+
static void __sprint_engine_name(char *name, const struct
engine_info *info)
{
WARN_ON(snprintf(name, INTEL_ENGINE_CS_MAX_NAME, "%s%u",
diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index a34ece53a771..e7784b3fb759 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -123,9 +123,13 @@
* simply ignores the register load under certain conditions.
* - One can actually load arbitrary many arbitrary registers:
Simply issue x
* address/value pairs. Don't overdue it, though, x <= 2^4 must
hold!
+ * - Newer hardware supports engine relative addresses but older
hardware does
+ * not. So never call MI_LRI directly, always use the
i915_get_lri_cmd()
+ * and i915_get_lri_reg() helper functions.
*/
-#define MI_LOAD_REGISTER_IMM(x) MI_INSTR(0x22, 2*(x)-1)
+#define __MI_LOAD_REGISTER_IMM(x) MI_INSTR(0x22, 2*(x)-1)
So we end up with code using __MI_LOAD_REGISTER_IMM for absolute, and
i915_get_lri_cmd for relative addressing. One is a macro, another is a
function call, and naming/case is different.
No. The __M_L_R_IMM version is for old code that can only run on
pre-Gen11 devices. The helper function is for all other code. The caller
does not know (or care) whether it should be using absolute or relative
addressing. That is the point of the helper.
See earlier discussion about wanting to make it totally obvious that
using the simple macro version is the wrong thing to do in new code.
...
Then all call sites can use just this single helper and the naming
remains familiar.
I originally had a single helper to be used in all call sites. Chris
objected violently to the idea of calling a helper in code that is
purely pre-Gen11 and thus has no need of the helper.
On the other hand, Rodrigo agreed with you that using the helper
everywhere was cleaner. So it is now a 2 vs 1 vote...
John.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx