Re: [PATCH v4] drm/i915: Adjust size of PIPE_CONTROL used for gen8 render seqno write

Dave Gordon <david.s.gordon@xxxxxxxxx> · Tue, 12 Apr 2016 17:18:46 +0100

On 12/04/16 14:51, Michał Winiarski wrote:
We started to use PIPE_CONTROL to write render ring seqno in order to
combat seqno write vs interrupt generation problems. This was introduced
by commit 7c17d377374d ("drm/i915: Use ordered seqno write interrupt
generation on gen8+ execlists").

On gen8+ size of PIPE_CONTROL with Post Sync Operation should be
6 dwords. When we're using older 5-dword variant it's possible to
observe inconsistent values written by PIPE_CONTROL with Post
Sync Operation from user batches, resulting in rendering corruptions.

v2: Fix BAT failures
v3: Comments on alignment and thrashing high dword of seqno (Chris)
v4: Updated commit msg (Mika)

Testcase: igt/gem_pipe_control_store_loop/*-qword-write
Issue: VIZ-7393
Cc: stable@xxxxxxxxxxxxxxx
Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxx>
Cc: Abdiel Janulgue <abdiel.janulgue@xxxxxxxxxxxxxxx>
Signed-off-by: Michał Winiarski <michal.winiarski@xxxxxxxxx>
---
  drivers/gpu/drm/i915/intel_lrc.c | 10 ++++++++--
  1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 0d6dc5e..30abe53 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1945,15 +1945,18 @@ static int gen8_emit_request_render(struct drm_i915_gem_request *request)
  	struct intel_ringbuffer *ringbuf = request->ringbuf;
  	int ret;

-	ret = intel_logical_ring_begin(request, 6 + WA_TAIL_DWORDS);
+	ret = intel_logical_ring_begin(request, 8 + WA_TAIL_DWORDS);
  	if (ret)
  		return ret;

+	/* We're using qword write, seqno should be aligned to 8 bytes. */
+	BUILD_BUG_ON(I915_GEM_HWS_INDEX & 1);
+
  	/* w/a for post sync ops following a GPGPU operation we
  	 * need a prior CS_STALL, which is emitted by the flush
  	 * following the batch.
  	 */
-	intel_logical_ring_emit(ringbuf, GFX_OP_PIPE_CONTROL(5));
+	intel_logical_ring_emit(ringbuf, GFX_OP_PIPE_CONTROL(6));
  	intel_logical_ring_emit(ringbuf,
  				(PIPE_CONTROL_GLOBAL_GTT_IVB |
  				 PIPE_CONTROL_CS_STALL |
@@ -1961,7 +1964,10 @@ static int gen8_emit_request_render(struct drm_i915_gem_request *request)
  	intel_logical_ring_emit(ringbuf, hws_seqno_address(request->engine));
  	intel_logical_ring_emit(ringbuf, 0);
  	intel_logical_ring_emit(ringbuf, i915_gem_request_get_seqno(request));
+	/* We're thrashing one dword of HWS. */
+	intel_logical_ring_emit(ringbuf, 0);
  	intel_logical_ring_emit(ringbuf, MI_USER_INTERRUPT);
+	intel_logical_ring_emit(ringbuf, MI_NOOP);
  	return intel_logical_ring_advance_and_submit(request);
  }

In the scheduler+preemption patches, we actually make use of the fact 
that we're writing a QWord, so that we can set the completed-seqno and 
clear the in-progress seqno in one operation (it doesn't actually matter 
if the h/w turns it into two DWord writes, though).

.Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx