We need to flush our srcu protecting resources about to be clobbered by the reset, inside of our timer failsafe but outside of the error->wedge_mutex, so that the failsafe can run in case the synchronize_srcu() takes too long (hits a shrinker deadlock?). Fixes: 72eb16df010a ("drm/i915: Serialise resets with wedging") References: https://bugs.freedesktop.org/show_bug.cgi?id=109605 Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxx> --- drivers/gpu/drm/i915/i915_reset.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_reset.c b/drivers/gpu/drm/i915/i915_reset.c index 9494b015185a..c2b7570730c2 100644 --- a/drivers/gpu/drm/i915/i915_reset.c +++ b/drivers/gpu/drm/i915/i915_reset.c @@ -941,9 +941,6 @@ static int do_reset(struct drm_i915_private *i915, unsigned int stalled_mask) { int err, i; - /* Flush everyone currently using a resource about to be clobbered */ - synchronize_srcu(&i915->gpu_error.reset_backoff_srcu); - err = intel_gpu_reset(i915, ALL_ENGINES); for (i = 0; err && i < RESET_MAX_RETRIES; i++) { msleep(10 * (i + 1)); @@ -1140,6 +1137,9 @@ static void i915_reset_device(struct drm_i915_private *i915, i915_wedge_on_timeout(&w, i915, 5 * HZ) { intel_prepare_reset(i915); + /* Flush everyone using a resource about to be clobbered */ + synchronize_srcu(&error->reset_backoff_srcu); + mutex_lock(&error->wedge_mutex); i915_reset(i915, engine_mask, reason); mutex_unlock(&error->wedge_mutex); -- 2.20.1 _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx