Re: [PATCH v12] drm/i915: Extend LRC pinning to cover GPU context writeback

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 25/01/2016 18:19, Daniel Vetter wrote:
On Fri, Jan 22, 2016 at 02:25:27PM +0000, Nick Hoath wrote:
Use the first retired request on a new context to unpin
the old context. This ensures that the hw context remains
bound until it has been written back to by the GPU.
Now that the context is pinned until later in the request/context
lifecycle, it no longer needs to be pinned from context_queue to
retire_requests.
This fixes an issue with GuC submission where the GPU might not
have finished writing back the context before it is unpinned. This
results in a GPU hang.

v2: Moved the new pin to cover GuC submission (Alex Dai)
     Moved the new unpin to request_retire to fix coverage leak
v3: Added switch to default context if freeing a still pinned
     context just in case the hw was actually still using it
v4: Unwrapped context unpin to allow calling without a request
v5: Only create a switch to idle context if the ring doesn't
     already have a request pending on it (Alex Dai)
     Rename unsaved to dirty to avoid double negatives (Dave Gordon)
     Changed _no_req postfix to __ prefix for consistency (Dave Gordon)
     Split out per engine cleanup from context_free as it
     was getting unwieldy
     Corrected locking (Dave Gordon)
v6: Removed some bikeshedding (Mika Kuoppala)
     Added explanation of the GuC hang that this fixes (Daniel Vetter)
v7: Removed extra per request pinning from ring reset code (Alex Dai)
     Added forced ring unpin/clean in error case in context free (Alex Dai)
v8: Renamed lrc specific last_context to lrc_last_context as there
     were some reset cases where the codepaths leaked (Mika Kuoppala)
     NULL'd last_context in reset case - there was a pointer leak
     if someone did reset->close context.
v9: Rebase over "Fix context/engine cleanup order"
v10: Rebase over nightly, remove WARN_ON which caused the
     dependency on dev.
v11: Kick BAT rerun
v12: Rebase

Signed-off-by: Nick Hoath <nicholas.hoath@xxxxxxxxx>
Issue: VIZ-4277

When resending patches, please include everyone who ever commented on this
in Cc: lines here. It's for the record and helps in assigning blame when
things inevitably blow up again ;-)

Even when it's just a resend to cause a BAT run for coverage?

-Daniel

---
  drivers/gpu/drm/i915/intel_lrc.c | 37 +++++++++++++++----------------------
  1 file changed, 15 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index dbf3729..b469817 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -779,10 +779,10 @@ intel_logical_ring_advance_and_submit(struct drm_i915_gem_request *request)
  	if (intel_ring_stopped(request->ring))
  		return 0;

-	if (request->ctx != ring->default_context) {
-		if (!request->ctx->engine[ring->id].dirty) {
+	if (request->ctx != request->ctx->i915->kernel_context) {
+		if (!request->ctx->engine[request->ring->id].dirty) {
  			intel_lr_context_pin(request);
-			request->ctx->engine[ring->id].dirty = true;
+			request->ctx->engine[request->ring->id].dirty = true;
  		}
  	}

@@ -2447,9 +2447,7 @@ intel_lr_context_clean_ring(struct intel_context *ctx,
  			    struct drm_i915_gem_object *ctx_obj,
  			    struct intel_ringbuffer *ringbuf)
  {
-	int ret;
-
-	if (ctx == ring->default_context) {
+	if (ctx == ctx->i915->kernel_context) {
  		intel_unpin_ringbuffer_obj(ringbuf);
  		i915_gem_object_ggtt_unpin(ctx_obj);
  	}
@@ -2463,13 +2461,10 @@ intel_lr_context_clean_ring(struct intel_context *ctx,
  		 * otherwise create a switch to idle request
  		 */
  		if (list_empty(&ring->request_list)) {
-			int ret;
-
-			ret = i915_gem_request_alloc(
+			req = i915_gem_request_alloc(
  					ring,
-					ring->default_context,
-					&req);
-			if (!ret)
+					NULL);
+			if (!IS_ERR(req))
  				i915_add_request(req);
  			else
  				DRM_DEBUG("Failed to ensure context saved");
@@ -2479,6 +2474,8 @@ intel_lr_context_clean_ring(struct intel_context *ctx,
  					typeof(*req), list);
  		}
  		if (req) {
+			int ret;
+
  			ret = i915_wait_request(req);
  			if (ret != 0) {
  				/**
@@ -2515,17 +2512,13 @@ void intel_lr_context_free(struct intel_context *ctx)
  		struct intel_ringbuffer *ringbuf = ctx->engine[i].ringbuf;
  		struct drm_i915_gem_object *ctx_obj = ctx->engine[i].state;

-		if (!ctx_obj)
-			continue;
-
-		if (ctx == ctx->i915->kernel_context) {
-			intel_unpin_ringbuffer_obj(ringbuf);
-			i915_gem_object_ggtt_unpin(ctx_obj);
-		}
+		if (ctx_obj)
+			intel_lr_context_clean_ring(
+						ctx,
+						ringbuf->ring,
+						ctx_obj,
+						ringbuf);

-		WARN_ON(ctx->engine[i].pin_count);
-		intel_ringbuffer_free(ringbuf);
-		drm_gem_object_unreference(&ctx_obj->base);
  	}
  }

--
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux