On Fri, Jan 24, 2014 at 01:50:25PM +0200, Ville Syrjälä wrote:
> On Thu, Jan 23, 2014 at 09:49:43PM +0000, Chris Wilson wrote:
> > Currently we report through our error state only the rings that have
> > been initialised (as detected by ring->obj). This check is done after
> > the GPU reset and ring re-initialisation, which means that the software
> > state may not be the same as when we captured the hardware error and we
> > may not print out any of the vital information for debugging the hang.
> >
> > This (and the implied object leak) is a regression from
> >
> > commit 3d57e5bd1284f44e325f3a52d966259ed42f9e05
> > Author: Ben Widawsky <ben@xxxxxxxxxxxx>
> > Date:   Mon Oct 14 10:01:36 2013 -0700
> >
> >     drm/i915: Do a fuller init after reset
> >
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: Ben Widawsky <ben@xxxxxxxxxxxx>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h       |  1 +
> >  drivers/gpu/drm/i915/i915_gpu_error.c | 19 +++++++++++++------
> >  2 files changed, 14 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index c45cbbecd66a..64a1aca7804d 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -334,6 +334,7 @@ struct drm_i915_error_state {
> >  	struct timeval time;
> >
> >  	struct drm_i915_error_ring {
> > +		int valid;
>
> bool

in a struct? I tend to think it leads to laziness not to coalesce them
into bitfields.

> > -			obj = error->ring[i].ctx;
> > -			if (obj) {
> > +			if ((obj = error->ring[i].ctx)) {
>
> Unrelated change. Although it does make this more consistent w/ the
> surrounding code. But I admit to not being a fan of assignments inside
> if statements.

The inconsistency was uglier.

> >  				err_printf(m, "%s --- HW Context = 0x%08x\n",
> >  					   dev_priv->ring[i].name,
> >  					   obj->gtt_offset);
> > @@ -826,11 +827,17 @@ static void i915_gem_record_rings(struct drm_device *dev,
> >  				  struct drm_i915_error_state *error)
> >  {
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	struct intel_ring_buffer *ring;
> >  	struct drm_i915_gem_request *request;
> >  	int i, count;
> >
> > -	for_each_ring(ring, dev_priv, i) {
> > +	for (i = 0; i < I915_NUM_RINGS; i++) {
> > +		struct intel_ring_buffer *ring = &dev_priv->ring[i];
> > +
> > +		if (ring->dev == NULL)
> > +			continue;
> > +
> > +		error->ring[i].valid = true;
> > +
>
> The code here runs before the reset, and it would actually oops if
> ring->obj==NULL, so using for_each_ring() here looks appropriate.

No, we need to record that ring->obj is NULL, especially if the ring
registers are still set...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
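
For illustration only, here is a minimal userspace sketch of the idea
argued above: mark a ring as valid at capture time based on whether it
exists on the device, and separately record whether its backing object
was NULL, so that the later report does not depend on post-reset state.
Everything in it is an assumption made for the example: struct ring,
struct ring_snapshot, NUM_RINGS, record_rings() and count_reported() are
hypothetical stand-ins, not the driver's types, and the one-bit flags
merely show the bitfield coalescing mentioned earlier (the patch itself
uses "int valid").

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NUM_RINGS 4	/* hypothetical; stands in for I915_NUM_RINGS */

/* Hypothetical, simplified stand-in for struct intel_ring_buffer. */
struct ring {
	const void *dev;	/* non-NULL once the ring exists on this device */
	const void *obj;	/* backing object; may be NULL at hang time */
	unsigned int head;	/* stand-in for a hardware register value */
};

/* Hypothetical, simplified stand-in for struct drm_i915_error_ring. */
struct ring_snapshot {
	unsigned int valid:1;	/* ring existed when the error was captured */
	unsigned int has_obj:1;	/* ring->obj was non-NULL at capture time */
	unsigned int head;	/* register value copied at capture time */
};

/*
 * Capture every ring that exists, whether or not it has a backing
 * object, so that "registers set but obj == NULL" remains visible.
 */
static void record_rings(const struct ring rings[NUM_RINGS],
			 struct ring_snapshot snap[NUM_RINGS])
{
	int i;

	assert(rings != NULL && snap != NULL);
	memset(snap, 0, NUM_RINGS * sizeof(*snap));

	for (i = 0; i < NUM_RINGS; i++) {
		if (rings[i].dev == NULL)
			continue;	/* ring not present on this device */

		snap[i].valid = 1;
		snap[i].has_obj = rings[i].obj != NULL;
		snap[i].head = rings[i].head;
	}
}

/* The report side consults only the snapshot, never the live rings. */
static int count_reported(const struct ring_snapshot snap[NUM_RINGS])
{
	int i, count = 0;

	for (i = 0; i < NUM_RINGS; i++)
		if (snap[i].valid)
			count++;

	return count;
}
```

In this sketch a ring with dev set but obj == NULL still shows up in the
snapshot with has_obj clear and its register value preserved, which is
the case a loop that skips on obj == NULL would silently drop.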