On Wed, Jan 13, 2016 at 05:28:17PM +0000, Arun Siluvery wrote:
> From: Tomas Elf <tomas.elf@xxxxxxxxx>
>
> i915_gem_check_wedge now returns a non-zero result in three different
> cases:
>
> 1. Legacy: A hang has been detected and full GPU reset is in progress.
>
> 2. Per-engine recovery:
>
>    a. A single engine reference can be passed to the function, in which
>       case only that engine will be checked. If that particular engine
>       is detected to be hung and is to be reset, this will yield a
>       non-zero result, but not if reset is in progress for any other
>       engine.
>
>    b. No engine reference is passed to the function, in which case all
>       engines are checked for ongoing per-engine hang recovery.
>
> Also, i915_wait_request was updated to take advantage of this new
> functionality. This is important since the TDR hang recovery mechanism
> needs a way to force waiting threads that hold the struct_mutex to give
> up the struct_mutex and try again after the hang recovery has
> completed. If i915_wait_request does not take per-engine hang recovery
> into account, there is no way for a waiting thread to know that a
> per-engine recovery is about to happen and that it needs to back off.
>
> Signed-off-by: Tomas Elf <tomas.elf@xxxxxxxxx>
> Signed-off-by: Arun Siluvery <arun.siluvery@xxxxxxxxx>
> Signed-off-by: Ian Lister <ian.lister@xxxxxxxxx>
> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         |  3 +-
>  drivers/gpu/drm/i915/i915_gem.c         | 60 +++++++++++++++++++++++++++------
>  drivers/gpu/drm/i915/intel_lrc.c        |  4 +--
>  drivers/gpu/drm/i915/intel_ringbuffer.c |  4 ++-
>  4 files changed, 56 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 85cf692..5be7d3e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -3033,7 +3033,8 @@ i915_gem_find_active_request(struct intel_engine_cs *ring);
>
>  bool i915_gem_retire_requests(struct drm_device *dev);
>  void i915_gem_retire_requests_ring(struct intel_engine_cs *ring);
> -int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
> +int __must_check i915_gem_check_wedge(struct drm_i915_private *dev_priv,
> +				      struct intel_engine_cs *engine,
>  				      bool interruptible);
>
>  static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index e3cfed2..e6eb45d 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -80,12 +80,38 @@ static void i915_gem_info_remove_obj(struct drm_i915_private *dev_priv,
>  	spin_unlock(&dev_priv->mm.object_stat_lock);
>  }
>
> +static inline int
> +i915_engine_reset_in_progress(struct drm_i915_private *dev_priv,
> +			      struct intel_engine_cs *engine)
> +{
> +	int ret = 0;
> +
> +	if (engine) {
> +		ret = !!(atomic_read(&dev_priv->ring[engine->id].hangcheck.flags)
> +			 & I915_ENGINE_RESET_IN_PROGRESS);
> +	} else {
> +		int i;
> +
> +		for (i = 0; i < I915_NUM_RINGS; i++)
> +			if (atomic_read(&dev_priv->ring[i].hangcheck.flags)
> +			    & I915_ENGINE_RESET_IN_PROGRESS) {
> +
> +				ret = 1;
> +				break;
> +			}
> +	}

Since this side will be called far more often than the writer, could you
not make this more convenient for the reader and move it to a global set
of flags in dev_priv->gpu_error?
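Something along these lines is roughly what I mean (only a sketch; the
engine_reset_flags name is made up for illustration, not existing code):

	/* In struct i915_gpu_error: one bit per engine, written only by
	 * the recovery code, read by everyone else. */
	atomic_t engine_reset_flags;

	static inline bool
	i915_engine_reset_in_progress(struct i915_gpu_error *error,
				      struct intel_engine_cs *engine)
	{
		unsigned int flags = atomic_read(&error->engine_reset_flags);

		/* engine == NULL asks "is any engine being reset?" */
		return engine ? flags & intel_ring_flag(engine) : flags;
	}

Then the common reader does a single atomic_read() on one word in
gpu_error rather than walking every ring's hangcheck state.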
To avoid regressing on the EIO front, the waiter sequence should look
like

	if (req->reset_counter != i915_reset_counter(&req->i915->gpu_error))
		return 0;

	if (flags & LOCKED &&
	    i915_engine_reset_in_progress(&req->i915->gpu_error, req->engine))
		return -EAGAIN;

Oh, and don't add a second boolean to __i915_wait_request, just
transform the first into flags.
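For instance (again just a sketch, assuming the reset counter lives in
the request as above; the flag names and values are illustrative):

	#define I915_WAIT_INTERRUPTIBLE	0x1
	#define I915_WAIT_LOCKED	0x2	/* caller holds struct_mutex */

	int __i915_wait_request(struct drm_i915_gem_request *req,
				unsigned int flags,
				s64 *timeout,
				struct intel_rps_client *rps);

Callers that currently pass interruptible=true would pass
I915_WAIT_INTERRUPTIBLE, and only callers holding the struct_mutex add
I915_WAIT_LOCKED, which is what gates the -EAGAIN backoff above.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre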