Since commit 821ed7df6e2a ("drm/i915: Update reset path to fix incomplete
requests"), setting the device as wedged is permanent as we cannot recover
the engine->submit_request. Stop clearing the I915_WEDGED status to prevent
userspace from getting itself in a muddle.

To fix this correctly, we need to stop overriding engine->submit_request
for the inflight requests and instead track the errors in flight. In the
meantime, let's start with the correctness fix.

Fixes: 821ed7df6e2a ("drm/i915: Update reset path to fix incomplete requests")
Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx> # v4.9+
---
 drivers/gpu/drm/i915/i915_drv.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index b1e9027a4f80..1c4f0a21eb22 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1824,8 +1824,11 @@ void i915_reset(struct drm_i915_private *dev_priv)
 	if (!test_and_clear_bit(I915_RESET_IN_PROGRESS, &error->flags))
 		return;
 
-	/* Clear any previous failed attempts at recovery. Time to try again. */
-	__clear_bit(I915_WEDGED, &error->flags);
+	if (test_bit(I915_WEDGED, &error->flags)) {
+		wake_up_bit(&error->flags, I915_RESET_IN_PROGRESS);
+		goto out;
+	}
+
 	error->reset_count++;
 
 	pr_notice("drm/i915: Resetting chip after gpu hang\n");
@@ -1874,6 +1877,7 @@ void i915_reset(struct drm_i915_private *dev_priv)
 wakeup:
 	i915_gem_reset_finish(dev_priv);
 	enable_irq(dev_priv->drm.irq);
+out:
 	wake_up_bit(&error->flags, I915_RESET_IN_PROGRESS);
 	return;
 
-- 
2.11.0
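
For readers following the control flow outside the kernel tree, the behaviour this change aims for can be approximated with a small userspace model. This is only a sketch under stated assumptions: `gpu_error_model`, `model_reset`, and the two flag constants are hypothetical stand-ins invented for illustration, the kernel's atomic bitops and waitqueue wakeups are replaced by plain integer operations, and the actual hardware reset is omitted entirely.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's error flags (not the kernel's definitions). */
enum {
	MODEL_RESET_IN_PROGRESS = 1u << 0,
	MODEL_WEDGED            = 1u << 1,
};

/* Hypothetical, simplified model of the driver's GPU error state. */
struct gpu_error_model {
	unsigned int flags;
	unsigned int reset_count;
};

/*
 * Simplified model of the patched reset entry point.
 * Returns true only if a reset attempt was actually started.
 */
bool model_reset(struct gpu_error_model *error)
{
	/* Nothing to do unless a reset was requested; consume the request. */
	if (!(error->flags & MODEL_RESET_IN_PROGRESS))
		return false;
	error->flags &= ~MODEL_RESET_IN_PROGRESS;

	/* Wedged is treated as sticky: bail out without clearing the bit. */
	if (error->flags & MODEL_WEDGED)
		return false;

	/* Only a non-wedged device counts (and would attempt) a reset. */
	error->reset_count++;
	return true;
}
```

In this model, a wedged device consumes the reset request but never increments `reset_count` and never loses its wedged bit, which is the property the patch is meant to guarantee.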