Re: [PATCH v2 1/3] drm/i915: Record the ringbuffer associated with the request

On 11/12/15 22:59, Chris Wilson wrote:
The request tells us where to read the ringbuf from, so use that
information to simplify the error capture. If no request was active at
the time of the hang, the ring is idle and there is no information
inside the ring pertaining to the hang.

Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
---
  drivers/gpu/drm/i915/i915_gpu_error.c | 29 ++++++++++-------------------
  1 file changed, 10 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 3e137fc701cf..6eefe9c36931 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -995,7 +995,7 @@ static void i915_gem_record_rings(struct drm_device *dev,

  	for (i = 0; i < I915_NUM_RINGS; i++) {
  		struct intel_engine_cs *ring = &dev_priv->ring[i];
-		struct intel_ringbuffer *rbuf;
+		struct intel_ringbuffer *rbuf = NULL;

  		error->ring[i].pid = -1;

@@ -1039,26 +1039,17 @@ static void i915_gem_record_rings(struct drm_device *dev,
  				}
  				rcu_read_unlock();
  			}
+
+			rbuf = request->ringbuf;
  		}

-		if (i915.enable_execlists) {
-			/* TODO: This is only a small fix to keep basic error
-			 * capture working, but we need to add more information
-			 * for it to be useful (e.g. dump the context being
-			 * executed).
-			 */
-			if (request)
-				rbuf = request->ctx->engine[ring->id].ringbuf;
-			else
-				rbuf = ring->default_context->engine[ring->id].ringbuf;
-		} else
-			rbuf = ring->buffer;
-
-		error->ring[i].cpu_ring_head = rbuf->head;
-		error->ring[i].cpu_ring_tail = rbuf->tail;
-
-		error->ring[i].ringbuffer =
-			i915_error_ggtt_object_create(dev_priv, rbuf->obj);
+		if (rbuf) {
+			error->ring[i].cpu_ring_head = rbuf->head;
+			error->ring[i].cpu_ring_tail = rbuf->tail;
+			error->ring[i].ringbuffer =
+				i915_error_ggtt_object_create(dev_priv,
+							      rbuf->obj);
+		}

  		error->ring[i].hws_page =
  			i915_error_ggtt_object_create(dev_priv, ring->status_page.obj);

I think the code you deleted was intended to capture the *default* ringbuffer when no request is active -- sometimes we switch an engine to the default context (and therefore to its ringbuffer) when there's no work to be done.
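To illustrate the fallback described above, here is a minimal standalone sketch, not driver code. The `sketch_*` types and `sketch_pick_ringbuf()` are hypothetical stand-ins for the real structures, and the sketch assumes the request carries its own ringbuffer pointer, as the patch proposes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the i915 structures. */
struct sketch_ringbuf {
	unsigned int head;
	unsigned int tail;
};

struct sketch_context {
	struct sketch_ringbuf *engine_ringbuf;	/* per-engine ringbuffer */
};

struct sketch_request {
	struct sketch_ringbuf *ringbuf;		/* as recorded by the patch */
};

struct sketch_engine {
	struct sketch_ringbuf *buffer;		/* legacy ringbuffer */
	struct sketch_context *default_context;	/* used in execlists mode */
};

/*
 * Prefer the ringbuffer recorded in the active request. With no active
 * request, fall back to the default context's ringbuffer (execlists)
 * or to the engine's own ringbuffer (legacy submission).
 */
static struct sketch_ringbuf *
sketch_pick_ringbuf(const struct sketch_engine *ring,
		    const struct sketch_request *request,
		    bool execlists)
{
	if (request)
		return request->ringbuf;

	if (execlists)
		return ring->default_context ?
			ring->default_context->engine_ringbuf : NULL;

	return ring->buffer;
}
```

With a fallback of this shape, the `if (rbuf)` guard in the patch would still cover the case where no default context exists.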

Another option that's sometimes useful is to capture the ringbuffer pointed to by the START register. This helps in finding situations where the driver and the GPU disagree about what should be in progress.
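A rough sketch of the START-based lookup, again with hypothetical names: `hw_ringbuf`, its `ggtt_offset` field, and both helpers are illustrative assumptions, and the comparison assumes START holds the GGTT offset of the ringbuffer the hardware is executing from.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: a ringbuffer and where it sits in the GGTT. */
struct hw_ringbuf {
	uint32_t ggtt_offset;
	uint32_t head;
	uint32_t tail;
};

/*
 * Return the candidate whose GGTT offset matches the value read from
 * the START register, or NULL if the hardware points somewhere the
 * driver does not know about.
 */
static const struct hw_ringbuf *
find_ringbuf_by_start(const struct hw_ringbuf *const *candidates,
		      size_t count, uint32_t start)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (candidates[i] && candidates[i]->ggtt_offset == start)
			return candidates[i];
	}

	return NULL;
}

/*
 * Non-zero when the driver's idea of the active ringbuffer differs
 * from what START reports, which is the disagreement worth capturing.
 */
static int start_disagrees(const struct hw_ringbuf *expected, uint32_t start)
{
	return !expected || expected->ggtt_offset != start;
}
```

Capturing both the driver's expected ringbuffer and the one found via START, when they differ, would show each side's view in the error state.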

I've got a few patches that add some of the error capture that has always been missing in execlist mode (such as actually capturing the active context), and add some more decoding of what we do capture. John Harrison posted them as part of the "Preemption support for GPU scheduler" patchset last week, although they don't really have anything to do with preemption per se.

One of them, "drm/i915/error: improve CSB reporting", also updates this area of the code, so maybe I should incorporate your change into the next revision of that patch?

.Dave.



