On Fri, May 12, 2017 at 01:55:11PM -0700, Michel Thierry wrote: > On 5/8/2017 11:31 AM, Michel Thierry wrote: > >On 4/29/2017 7:19 AM, Chris Wilson wrote: > >>On Thu, Apr 27, 2017 at 04:12:42PM -0700, Michel Thierry wrote: > >>>From: Arun Siluvery <arun.siluvery@xxxxxxxxxxxxxxx> > >>> > ... > >>>+ } > >>>+ > >>> intel_prepare_reset(dev_priv); > >>> > >>> set_bit(I915_RESET_HANDOFF, &dev_priv->gpu_error.flags); > >>>@@ -2680,6 +2706,7 @@ static void i915_reset_and_wakeup(struct > >>>drm_i915_private *dev_priv) > >>> kobject_uevent_env(kobj, > >>> KOBJ_CHANGE, reset_done_event); > >>> > >>>+finish: > >>> /* > >>> * Note: The wake_up also serves as a memory barrier so that > >>> * waiters see the updated value of the dev_priv->gpu_error. > >>>@@ -2781,7 +2808,7 @@ void i915_handle_error(struct drm_i915_private > >>>*dev_priv, > >>> &dev_priv->gpu_error.flags)) > >>> goto out; > >>> > >>>- i915_reset_and_wakeup(dev_priv); > >>>+ i915_reset_and_wakeup(dev_priv, engine_mask); > >> > >>? You don't need to wakeup the struct_mutex so we don't need this after > >>per-engine resets. Time to split up i915_reset_and_wakeup(), because we > >>certainly shouldn't be calling intel_finish_reset() without first calling > >>intel_prepare_reset(). Which is right here in my tree... > >> > > > >Looking at your tree, it wouldn't call finish_reset there either, only > >these two are called after a successful reset: > > > >finish: > > clear_bit(I915_RESET_BACKOFF, &dev_priv->gpu_error.flags); > > wake_up_all(&dev_priv->gpu_error.reset_queue); > > > >But you're right, we only need to clear the error flag, no need to call > >wake_up_all. > > > >Should I move the per-engine reset to i915_handle_error, and then leave > >i915_reset_and_wakeup just for full resets? > >That would also make the promotion from per-engine to global look a bit > >'clearer'. > > > > I just noticed an issue if I don't call wake_up_all. There can be > someone else waiting for the reset to complete > (i915_mutex_lock_interruptible -> i915_gem_wait_for_error). > > I915_RESET_BACKOFF has/had 2 roles, stop any other user to grab the > struct mutex (which we won't need in reset-engine) and prevent two > concurrent reset attempts (which we still want). Time to add a new > flag for the later? (I915_RESET_ENGINE_IN_PROGRESS?) Yes, that would be a good idea to avoid dual purposing the bits. Now that we do direct resets along the wait path, we can completely drop the i915_mutex_interruptible(). (No one else should be holding the mutex indefinitely.) I think that's a better approach -- I think we've already moved all the EIO magic aware to the ABI points where we deemed it necessary. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx