Re: [PATCH] drm/i915: Avoid the gpu reset vs. modeset deadlock

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 7/27/2017 5:13 AM, Tvrtko Ursulin wrote:

On 20/07/2017 21:23, Daniel Vetter wrote:
... using the biggest hammer we have. This is essentially a weaponized
version of the timeout-based wedging Chris added in

commit 36703e79a982c8ce5a8e43833291f2719e92d0d1
Author: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Date:   Thu Jun 22 11:56:25 2017 +0100

      drm/i915: Break modeset deadlocks on reset

Because defense-in-depth is good it's good to still have both. Also
note that with the locking change we can now restrict this a lot (old
gpus and special testing only), so this doesn't kill the TDR benefits
on at least anything remotely modern.

And futuremore with a few tricks it should be possible to make a much
more educated guess about whether an atomic commit is stuck waiting on
the gpu (atomic_t counting the pending i915_sw_fence used by the
atomic modeset code should do it), so we can improve this.

But for now just start with something that is guaranteed to recover
faster, for much better CI througput.

This defacto reverts TDR on these platforms, but there's not really a
single commit to specify as the sole offender.

v2: Add a debug message to explain what's going on. We can't DRM_ERROR
because that spams CI. And the timeout based fallback still prints a
DRM_ERROR, in case something goes wrong.

Fixes: 4680816be336 ("drm/i915: Wait first for submission, before waiting for request completion")
Fixes: 221fe7994554 ("drm/i915: Perform a direct reset of the GPU from the waiter")
Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxx>
Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>
Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx>
---
   drivers/gpu/drm/i915/intel_display.c | 6 ++++++
   1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 02b1f4966049..995522e40ec1 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -3471,6 +3471,12 @@ void intel_prepare_reset(struct drm_i915_private *dev_priv)
           !gpu_reset_clobbers_display(dev_priv))
               return;

+     /* We have a modeset vs reset deadlock, defensively unbreak it.
+      *
+      * FIXME: We can do a _lot_ better, this is just a first iteration.*/

I think the standard is to move '*/' to a new line.

+     i915_gem_set_wedged(dev_priv);
+     DRM_DEBUG_DRIVER("Wedging GPU to avoid deadlocks with pending modeset updates\n");
+
       /*
        * Need mode_config.mutex so that we don't
        * trample ongoing ->detect() and whatnot.


Looks fine to me since it only affects very old platforms, and is a
temporary measure.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>


Also,

Reviewed-by: Michel Thierry <michel.thierry@xxxxxxxxx>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux