Re: [PATCH] drm/i915: Reduce engine->emit_flush() to a single mode parameter

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 27/07/16 11:53, Chris Wilson wrote:
Rather than passing a complete set of GPU cache domains for either
invalidation or for flushing, or even both, just pass a single parameter
to the engine->emit_flush to determine the required operations.

engine->emit_flush(GPU, 0) -> engine->emit_flush(EMIT_INVALIDATE)
engine->emit_flush(0, GPU) -> engine->emit_flush(EMIT_FLUSH)
engine->emit_flush(GPU, GPU) -> engine->emit_flush(EMIT_FLUSH | EMIT_INVALIDATE)

This allows us to extend the behaviour easily in future, for example if
we want just a command barrier without the overhead of flushing.

Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Dave Gordon <david.s.gordon@xxxxxxxxx>
Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>
---
 drivers/gpu/drm/i915/i915_gem_context.c    |  2 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 10 ++----
 drivers/gpu/drm/i915/i915_gem_request.c    |  2 +-
 drivers/gpu/drm/i915/intel_lrc.c           | 23 +++++-------
 drivers/gpu/drm/i915/intel_ringbuffer.c    | 57 +++++++++++-------------------
 drivers/gpu/drm/i915/intel_ringbuffer.h    |  6 ++--
 7 files changed, 38 insertions(+), 64 deletions(-)

[snip]

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 39fa9eb10514..671b1cab5e54 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1666,8 +1666,7 @@ static int hsw_mm_switch(struct i915_hw_ppgtt *ppgtt,
 	int ret;

 	/* NB: TLBs must be flushed and invalidated before a switch */
-	ret = engine->emit_flush(req,
-				 I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
+	ret = engine->emit_flush(req, EMIT_INVALIDATE | EMIT_FLUSH);

The example in the commit message says "EMIT_FLUSH | EMIT_INVALIDATE"
which I think looks much nicer. Flush *after* invalidate would be pretty meaningless!

For even more syntactic sugar, we could choose "FLUSH + INVALIDATE" as + and | are equivalent for disjoint bitfields.

 int intel_logical_ring_alloc_request_extras(struct drm_i915_gem_request *request)
@@ -998,9 +998,7 @@ static int intel_logical_ring_workarounds_emit(struct drm_i915_gem_request *req)
 	if (w->count == 0)
 		return 0;

-	ret = req->engine->emit_flush(req,
-				      I915_GEM_GPU_DOMAINS,
-				      I915_GEM_GPU_DOMAINS);
+	ret = req->engine->emit_flush(req, EMIT_BARRIER);
 	if (ret)
 		return ret;

Distinguishing flush-and-invalidate from BARRIER seems like a good idea, because here we really don't want to flush or invalidate the GPU's caches; we really only want some sort of synchronisation of memory activity before changing the register. Therefore ...

[snip]

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 00723401f98c..76d0495943c3 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -292,8 +292,10 @@ struct intel_engine_cs {
 	u32 ctx_desc_template;
 	int		(*emit_request)(struct drm_i915_gem_request *request);
 	int		(*emit_flush)(struct drm_i915_gem_request *request,
-				      u32 invalidate_domains,
-				      u32 flush_domains);
+				      u32 mode);
+#define EMIT_INVALIDATE	BIT(0)
+#define EMIT_FLUSH	BIT(1)
+#define EMIT_BARRIER	(EMIT_INVALIDATE | EMIT_FLUSH)

... I think these should be three (or maybe four) distinct operations, and collapsing BARRIER to FLUSH+INVALIDATE if the h/w can't do a simple barrier should be left to the gen-specific code. Also what about allowing for a FLUSH or INVALIDATE without BARRIER semantics? Do we care about whether the command streamer stalls before/after these operations?

So maybe start with

#define	EMIT_CS_STALL	BIT(0)	/* Stall CS before operation	*/
#define	EMIT_WRITEBACK	BIT(1)	/* Ensure caches written back	*/
#define	EMIT_DROP	BIT(2)	/* Drop contents of cache(s)	*/
#define	EMIT_POSTSYNC	BIT(3)	/* Wait until complete		*/

and then

#define	EMIT_BARRIER	(EMIT_CS_STALL+EMIT_POSTSYNC)
#define	EMIT_FLUSH	(EMIT_CS_STALL+EMIT_WRITEBACK)
#define	EMIT_INVALIDATE	(EMIT_DROP+EMIT_POSTSYNC)
#define	EMIT_SWITCH	(EMIT_FLUSH+EMIT_INVALIDATE)

or whatever combinations of basic operations are actually useful.

.Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux