On Mon, Dec 14, 2015 at 11:14:31AM +0000, Dave Gordon wrote: > On 11/12/15 22:59, Chris Wilson wrote: > >The request tells us where to read the ringbuf from, so use that > >information to simplify the error capture. If no request was active at > >the time of the hang, the ring is idle and there is no information > >inside the ring pertaining to the hang. > > > >Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > >--- > > drivers/gpu/drm/i915/i915_gpu_error.c | 29 ++++++++++------------------- > > 1 file changed, 10 insertions(+), 19 deletions(-) > > > >diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c > >index 3e137fc701cf..6eefe9c36931 100644 > >--- a/drivers/gpu/drm/i915/i915_gpu_error.c > >+++ b/drivers/gpu/drm/i915/i915_gpu_error.c > >@@ -995,7 +995,7 @@ static void i915_gem_record_rings(struct drm_device *dev, > > > > for (i = 0; i < I915_NUM_RINGS; i++) { > > struct intel_engine_cs *ring = &dev_priv->ring[i]; > >- struct intel_ringbuffer *rbuf; > >+ struct intel_ringbuffer *rbuf = NULL; > > > > error->ring[i].pid = -1; > > > >@@ -1039,26 +1039,17 @@ static void i915_gem_record_rings(struct drm_device *dev, > > } > > rcu_read_unlock(); > > } > >+ > >+ rbuf = request->ringbuf; > > } > > > >- if (i915.enable_execlists) { > >- /* TODO: This is only a small fix to keep basic error > >- * capture working, but we need to add more information > >- * for it to be useful (e.g. dump the context being > >- * executed). > >- */ > >- if (request) > >- rbuf = request->ctx->engine[ring->id].ringbuf; > >- else > >- rbuf = ring->default_context->engine[ring->id].ringbuf; > >- } else > >- rbuf = ring->buffer; > >- > >- error->ring[i].cpu_ring_head = rbuf->head; > >- error->ring[i].cpu_ring_tail = rbuf->tail; > >- > >- error->ring[i].ringbuffer = > >- i915_error_ggtt_object_create(dev_priv, rbuf->obj); > >+ if (rbuf) { > >+ error->ring[i].cpu_ring_head = rbuf->head; > >+ error->ring[i].cpu_ring_tail = rbuf->tail; > >+ error->ring[i].ringbuffer = > >+ i915_error_ggtt_object_create(dev_priv, > >+ rbuf->obj); > >+ } > > > > error->ring[i].hws_page = > > i915_error_ggtt_object_create(dev_priv, ring->status_page.obj); > > I think the code you deleted is intended to capture the *default* > ringbuffer if there is no request active -- sometimes we will switch > an engine to the default context (and therefore ringbuffer) when > there's no work to be done. So answer the question, why? I don't have a use for it. This code in particular is nothing more than a hack for execlists and in no way reflects my intentions for the postmortem debugging tool. > Another option that's sometimes useful is to capture the ringbuffer > pointed to by the START register. This helps in finding situations > where the driver and the GPU disagree about what should be in > progress. That is a possibitly, except is no more interesting than inspecting the START vs expected (and requires the stop_machine rework to walk the lists without crashing). > I've got a few patches that update some of the error capture that's > always been missing in execlist mode (like, actually capturing the > active context), and add some more decoding of what we do capture. No decoding. That is easier done in userspace. And I sent patches to do more capturing many, many months ago, along with fixing up most of the invalid ppgtt state. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx