On Wed, Jul 19, 2017 at 3:44 PM, Daniel Vetter <daniel.vetter@xxxxxxxx> wrote: > On Wed, Jul 19, 2017 at 3:32 PM, Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> wrote: >> Quoting Daniel Vetter (2017-07-19 13:54:56) >>> ... using the biggest hammer we have. This is essentially a weaponized >>> version of the timeout-based wedging Chris added in >>> >>> commit 36703e79a982c8ce5a8e43833291f2719e92d0d1 >>> Author: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> >>> Date: Thu Jun 22 11:56:25 2017 +0100 >>> >>> drm/i915: Break modeset deadlocks on reset >>> >>> Because defense-in-depth is good it's good to still have both. Also >>> note that with the locking change we can now restrict this a lot (old >>> gpus and special testing only), so this doesn't kill the TDR benefits >>> on at least anything remotely modern. >>> >>> And futuremore with a few tricks it should be possible to make a much >>> more educated guess about whether an atomic commit is stuck waiting on >>> the gpu (atomic_t counting the pending i915_sw_fence used by the >>> atomic modeset code should do it), so we can improve this. >>> >>> But for now just start with something that is guaranteed to recover >>> faster, for much better CI througput. >>> >>> This defacto reverts TDR on these platforms, but there's not really a >>> single commit to specify as the sole offender. >>> >>> Fixes: 4680816be336 ("drm/i915: Wait first for submission, before waiting for request completion") >>> Fixes: 221fe7994554 ("drm/i915: Perform a direct reset of the GPU from the waiter") >>> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> >>> Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxx> >>> Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx> >>> Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx> >>> --- >>> drivers/gpu/drm/i915/intel_display.c | 5 +++++ >>> 1 file changed, 5 insertions(+) >>> >>> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c >>> index 97777ffa1566..010a1f3e000c 100644 >>> --- a/drivers/gpu/drm/i915/intel_display.c >>> +++ b/drivers/gpu/drm/i915/intel_display.c >>> @@ -3471,6 +3471,11 @@ void intel_prepare_reset(struct drm_i915_private *dev_priv) >>> !gpu_reset_clobbers_display(dev_priv)) >>> return; >>> >>> + /* We have a modeset vs reset deadlock, defensively unbreak it. >>> + * >>> + * FIXME: We can do a _lot_ better, this is just a first iteration.*/ >> >> You should keep the error message for wedging the device. If all goes >> well it is removed in a few patches, so shouldn't be an eyesore for >> long. > > Yeah makes sense, just in case the next patches need to be reverted > for some reasons. That's why I split it ou. Something like > > DRM_ERROR("Wedging gpu to unblock modesets\n"); > > and then remove that again 2 patches later? After thinking about this a bit more, s/DRM_ERROR/DRM_DEBUG_KMS/ or so. We know we can deadlock here, complaining about that only spams dmesg and results in noise in CI, hiding worse stuff. The timeout based wedging as fallback should have a DRM_ERROR since it's unexpected, this one here imo sholdn't. And with the follow-up patches it won't be unconditional anymore (if we don't have to revert them again). -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx