On 22/01/18 09:32, Chris Wilson wrote:
Quoting Daniele Ceraolo Spurio (2018-01-22 16:09:28)
On 22/01/18 07:13, Chris Wilson wrote:
Quoting Mika Kuoppala (2018-01-22 15:08:16)
Daniele Ceraolo Spurio <daniele.ceraolospurio@xxxxxxxxx> writes:
On 19/01/18 05:05, Mika Kuoppala wrote:
Daniele Ceraolo Spurio <daniele.ceraolospurio@xxxxxxxxx> writes:
From: Thomas Daniel <thomas.daniel@xxxxxxxxx>
Enhanced Execlists is an upgraded version of execlists which supports
up to 8 ports. The lrcs to be submitted are written to a submit queue,
which is then loaded on the HW. When writing to the ELSP register, the
lrcs are written cyclically in the queue from position 0 to position 7.
Alternatively, it is possible to write directly in the individual
positions of the queue using the ELSQ registers. To be able to re-use
all the existing code we're using the latter method and we're currently
limiting ourself to only using 2 elements.
The preemption flow is sligthly different with enhanced execlists, so
this patch turns preemption off temporarily for Gen11+ while we wait for
the new mechanism to land.
v2: Rebase.
v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio).
v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio)
v5: Reword commit, rename regs to be closer to specs, turn off
preemption (Daniele), reuse engine->execlists.elsp (Chris)
Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Thomas Daniel <thomas.daniel@xxxxxxxxx>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@xxxxxxxxx>
Was going to adopt this patch from Rodrigo but you were faster.
I choose to stash the elsq and use it as a gen11 vs rest toggle:
Relevant bits:
+static inline void write_port(struct intel_engine_execlists * const execlists,
+ unsigned int n,
+ u64 desc)
+{
+ if (execlists->elsq)
+ gen11_elsq_write(desc, n, execlists->elsq);
+ else
+ gen8_elsp_write(desc, execlists->elsp);
+}
+
+static inline void submit_ports(struct intel_engine_execlists * const execlists)
+{
+ /* for gen11+ we need to manually load the submit queue */
+ if (execlists->elsq) {
+ struct intel_engine_cs *engine =
+ container_of(execlists,
+ struct intel_engine_cs,
+ execlists);
+ struct drm_i915_private *dev_priv = engine->i915;
+
+ I915_WRITE_FW(RING_ELCR(engine), ELCR_LOAD);
+ }
+}
+
I was undecided about hiding the code in sub-functions because of the
pre-emption path. There is no need in gen11 to inject a context to
preempt to idle,
Really? The preempt-to-idle is so that we can sync the bookkeeping with
the pending CS interrupts. The HW doesn't require it currently either,
it's the SW that does. If you have a way to avoid that, that should be
applicable to the current code as well?
-Chris
We can't avoid preempt-to-idle, we can do it in a simpler way. There is
a bit in RING_EXECLIST_CONTROL that triggers a preemp-to-idle, without
the need to do a ctx injection. We'll need to move preemption completion
detection to the CSB value instead of the HWSP write, not sure about the
impact of that on our bookkeeping.
Ah, shucks :( I was hoping there's a simple way to avoid idling.
I have to ask, is it worth it? If we still have to do a CS interrupt
round trip, what's the difference? Hmm. I wonder if assume that the
preemption is nearly instantaneous (say 10us), we could short-circuit
the interrupt and poll. Falling back to interrupt if longer, and/or
resetting the GPU to guarantee latencies.
-Chris
The SW cost shouldn't be too bad, so having 1 less ctx switch should
give us some some benefits. Trying a poll with a fall-back sounds nice,
but we'll probably have to wait until we have timing info to see if the
polling makes sense at all.
Daniele
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx