On Thu, Dec 05, 2013 at 04:22:02PM +0000, Chris Wilson wrote: > On Thu, Dec 05, 2013 at 06:07:27PM +0200, Mika Kuoppala wrote: > > Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes: > > > > > As the rings may be processed and their requests deallocated in a > > > different order to the natural retirement during a reset, > > > > > > /* Whilst this request exists, batch_obj will be on the > > > * active_list, and so will hold the active reference. Only when this > > > * request is retired will the the batch_obj be moved onto the > > > * inactive_list and lose its active reference. Hence we do not need > > > * to explicitly hold another reference here. > > > */ > > > > > > is violated, and the batch_obj may be dereferenced after it had been > > > freed on another ring. This can be simply avoided by processing the > > > status update prior to deallocating any requests. > > > > > > Fixes regression (a possible OOPS following a GPU hang) from > > > commit aa60c664e6df502578454621c3a9b1f087ff8d25 > > > Author: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx> > > > Date: Wed Jun 12 15:13:20 2013 +0300 > > > > > > drm/i915: find guilty batch buffer on ring resets > > > > > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > > Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxx> > > > Cc: stable@xxxxxxxxxxxxxxx > > > > Passes the igt/gem_reset_stats/close-pending-fork and > > doesn't affect the fast path. > > > > Reviewed-by: Mika Kuoppala <mika.kuoppala@xxxxxxxxx> > > For reference, this is the comment I added upon request: > > @@ -2502,8 +2508,15 @@ void i915_gem_reset(struct drm_device *dev) > struct intel_ring_buffer *ring; > int i; > > + /* Before we free the objects from the requests, we need to inspect > + * them for finding the guilty party. As the requests only borrow > + * their reference to the objects, the inspection must be done first. > + */ > + for_each_ring(ring, dev_priv, i) > + i915_gem_reset_ring_status(dev_priv, ring); > + > for_each_ring(ring, dev_priv, i) > - i915_gem_reset_ring_lists(dev_priv, ring); > + i915_gem_reset_ring_cleanup(dev_priv, ring); > > i915_gem_cleanup_ringbuffer(dev); QA didn't hit this bug since the test wasn't added to the right make target. With that sorted I've now merged this patch (including comment) to -fixes. Thanks, Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html