On Wed, Dec 04, 2013 at 11:37:09AM +0000, Chris Wilson wrote: > As the rings may be processed and their requests deallocated in a > different order to the natural retirement during a reset, > > /* Whilst this request exists, batch_obj will be on the > * active_list, and so will hold the active reference. Only when this > * request is retired will the the batch_obj be moved onto the > * inactive_list and lose its active reference. Hence we do not need > * to explicitly hold another reference here. > */ > > is violated, and the batch_obj may be dereferenced after it had been > freed on another ring. This can be simply avoided by processing the > status update prior to deallocating any requests. > > Fixes regression (a possible OOPS following a GPU hang) from > commit aa60c664e6df502578454621c3a9b1f087ff8d25 > Author: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> > Date: Wed Jun 12 15:13:20 2013 +0300 > > drm/i915: find guilty batch buffer on ring resets > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxx> > Cc: stable@xxxxxxxxxxxxxxx > --- > drivers/gpu/drm/i915/i915_gem.c | 29 +++++++++++++++++++---------- > 1 file changed, 19 insertions(+), 10 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index ec4502034203..c1e481d36575 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -2442,15 +2442,24 @@ static void i915_gem_free_request(struct drm_i915_gem_request *request) > kfree(request); > } > > -static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv, > - struct intel_ring_buffer *ring) > +static void i915_gem_reset_ring_status(struct drm_i915_private *dev_priv, > + struct intel_ring_buffer *ring) > { > - u32 completed_seqno; > - u32 acthd; > + u32 completed_seqno = ring->get_seqno(ring, false); > + u32 acthd = intel_ring_get_active_head(ring); > + struct drm_i915_gem_request *request; > + > + list_for_each_entry(request, &ring->request_list, list) { > + if (i915_seqno_passed(completed_seqno, request->seqno)) > + continue; > > - acthd = intel_ring_get_active_head(ring); > - completed_seqno = ring->get_seqno(ring, false); > + i915_set_reset_status(ring, request, acthd); > + } > +} Indeed the fix in the gem reset code is a bit simpler than what I've feared. We still have fairly tricky code which depends upon that implicit reference in non-obvious ways. So I still think Mika's refcount patch with the comments updated is the better approach. -Daniel > > +static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv, > + struct intel_ring_buffer *ring) > +{ > while (!list_empty(&ring->request_list)) { > struct drm_i915_gem_request *request; > > @@ -2458,9 +2467,6 @@ static void i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv, > struct drm_i915_gem_request, > list); > > - if (request->seqno > completed_seqno) > - i915_set_reset_status(ring, request, acthd); > - > i915_gem_free_request(request); > } > > @@ -2503,7 +2509,10 @@ void i915_gem_reset(struct drm_device *dev) > int i; > > for_each_ring(ring, dev_priv, i) > - i915_gem_reset_ring_lists(dev_priv, ring); > + i915_gem_reset_ring_status(dev_priv, ring); > + > + for_each_ring(ring, dev_priv, i) > + i915_gem_reset_ring_cleanup(dev_priv, ring); > > i915_gem_cleanup_ringbuffer(dev); > > -- > 1.8.5.1 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@xxxxxxxxxxxxxxxxxxxxx > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx