Re: [PATCH 2/2] drm/i915/gt: Only unwedge if we can reset first


On 9/9/19 11:06 PM, Chris Wilson wrote:
Quoting Daniele Ceraolo Spurio (2019-09-10 01:59:38)


On 9/9/19 3:55 PM, Chris Wilson wrote:
Unwedging the GPU requires a successful GPU reset before we restore the
default submission, or else we may see residual context switch events
that we were not expecting.

Reported-by: Janusz Krzysztofik <janusz.krzysztofik@xxxxxxxxxxxxxxx>
Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Janusz Krzysztofik <janusz.krzysztofik@xxxxxxxxxxxxxxx>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@xxxxxxxxx>
---
   drivers/gpu/drm/i915/gt/intel_reset.c | 7 ++++++-
   1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index fe57296b790c..5242496a893a 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -809,6 +809,7 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
       struct intel_gt_timelines *timelines = &gt->timelines;
       struct intel_timeline *tl;
       unsigned long flags;
+     bool ok;

       if (!test_bit(I915_WEDGED, &gt->reset.flags))
               return true;
@@ -854,7 +855,11 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
       }
       spin_unlock_irqrestore(&timelines->lock, flags);

-     intel_gt_sanitize(gt, false);
+     ok = false;
+     if (!reset_clobbers_display(gt->i915))
+             ok = __intel_gt_reset(gt, ALL_ENGINES) == 0;

Of the things we had in gt_sanitize(), we're OK skipping
uc_sanitize() because we take care of that during wedge (from
intel_uc_reset_prepare), but what about the loop of
__intel_engine_reset()? Is that safe to skip here?

I think yes, because we always follow the unwedge with a GT restart. That
is either via the full reset or the sanitize+restart on resume. Both call
paths will also set the wedged bit if they fail. gem_eio/suspend should
be testing the recovery upon resume path, and even gem_eio/*-stress
should give reasonable coverage of the normal recovery via full reset.

Apart from that, the patch LGTM. Worth noting that with this change a
successful reset is required to unwedge even after a suspend/resume
cycle (in gem_sanitize), which is a good thing IMO.

Hence relaxing the reset_clobbers_display() check is important to retain
the ability to clear wedged across suspend on older devices.
-Chris


Sold!

Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@xxxxxxxxx>

Daniele
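
For reference, the control flow discussed in this thread can be sketched as a small standalone C model. This is only an illustration under stated assumptions, not the i915 code: struct mock_gt, mock_gt_reset(), mock_unset_wedged() and mock_selfcheck() are hypothetical stand-ins. The quoted hunk stops after the assignment to ok, so the early return on failure is an assumption inferred from the patch subject.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct intel_gt; not the real i915 type. */
struct mock_gt {
	bool wedged;                 /* models the I915_WEDGED bit */
	bool reset_clobbers_display; /* models reset_clobbers_display() */
	int reset_err;               /* value the mocked reset returns, 0 == success */
	int resets_attempted;        /* how many times the mocked reset ran */
};

/* Mock of __intel_gt_reset(gt, ALL_ENGINES): returns 0 on success. */
static int mock_gt_reset(struct mock_gt *gt)
{
	gt->resets_attempted++;
	return gt->reset_err;
}

/* Model of the patched unwedge path: clear wedged only after a good reset. */
static bool mock_unset_wedged(struct mock_gt *gt)
{
	bool ok;

	if (!gt->wedged)
		return true;

	ok = false;
	if (!gt->reset_clobbers_display)
		ok = mock_gt_reset(gt) == 0;
	if (!ok)
		return false; /* assumption: remain wedged if we could not reset */

	gt->wedged = false;
	return true;
}

/* Exercises the three cases: good reset, failed reset, reset not allowed. */
static void mock_selfcheck(void)
{
	struct mock_gt good = { .wedged = true };
	struct mock_gt bad = { .wedged = true, .reset_err = -1 };
	struct mock_gt display = { .wedged = true, .reset_clobbers_display = true };

	assert(mock_unset_wedged(&good) && !good.wedged);
	assert(!mock_unset_wedged(&bad) && bad.wedged);
	assert(!mock_unset_wedged(&display) && display.wedged);
	assert(display.resets_attempted == 0);
}
```

Calling mock_selfcheck() from a test harness walks the three cases; in the model, a device whose reset would clobber the display never attempts the reset and so stays wedged, which is the situation the relaxed display check is meant to narrow.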
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx