Quoting Daniele Ceraolo Spurio (2019-09-10 01:59:38)
>
>
> On 9/9/19 3:55 PM, Chris Wilson wrote:
> > Unwedging the GPU requires a successful GPU reset before we restore the
> > default submission, or else we may see residual context switch events
> > that we were not expecting.
> >
> > Reported-by: Janusz Krzysztofik <janusz.krzysztofik@xxxxxxxxxxxxxxx>
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: Janusz Krzysztofik <janusz.krzysztofik@xxxxxxxxxxxxxxx>
> > Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@xxxxxxxxx>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_reset.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
> > index fe57296b790c..5242496a893a 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> > @@ -809,6 +809,7 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
> >  	struct intel_gt_timelines *timelines = &gt->timelines;
> >  	struct intel_timeline *tl;
> >  	unsigned long flags;
> > +	bool ok;
> >
> >  	if (!test_bit(I915_WEDGED, &gt->reset.flags))
> >  		return true;
> > @@ -854,7 +855,11 @@ static bool __intel_gt_unset_wedged(struct intel_gt *gt)
> >  	}
> >  	spin_unlock_irqrestore(&timelines->lock, flags);
> >
> > -	intel_gt_sanitize(gt, false);
> > +	ok = false;
> > +	if (!reset_clobbers_display(gt->i915))
> > +		ok = __intel_gt_reset(gt, ALL_ENGINES) == 0;
>
> Of the things we had in gt_sanitize, we're ok skipping the
> uc_sanitize() because we take care of that during wedge (from
> intel_uc_reset_prepare), but what about the loop of
> __intel_engine_reset()? Is that safe to skip here?

I think yes, because we always follow the unwedge with a GT restart.
That is either via the full reset or the sanitize+restart on resume.
Both call paths will also set the wedged bit if they fail.
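To make that argument concrete, here is a minimal, self-contained C model
of the flow. It is only a sketch and not the driver code: struct model_gt
and every model_* function are hypothetical stand-ins, and the two *_works
fields are test knobs that do not exist in i915. The quoted hunk is cut off
right after `ok` is assigned, so the early return on `!ok` is an assumption
taken from the commit message ("requires a successful GPU reset").

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver state; not the real struct intel_gt. */
struct model_gt {
	bool wedged;                 /* models the I915_WEDGED bit */
	bool reset_clobbers_display; /* models reset_clobbers_display(gt->i915) */
	bool hw_reset_works;         /* test knob: does a full GPU reset succeed? */
	bool restart_works;          /* test knob: does the GT restart succeed? */
};

/* Models "__intel_gt_reset(gt, ALL_ENGINES) == 0". */
static bool model_full_reset(const struct model_gt *gt)
{
	return gt->hw_reset_works;
}

/* Models the patched tail of __intel_gt_unset_wedged(). */
static bool model_unset_wedged(struct model_gt *gt)
{
	bool ok;

	if (!gt->wedged)
		return true;

	ok = false;
	if (!gt->reset_clobbers_display)
		ok = model_full_reset(gt);
	if (!ok)
		return false; /* assumption: no successful reset, stay wedged */

	gt->wedged = false;
	return true;
}

/* Models the GT restart that follows every unwedge; re-wedges on failure. */
static bool model_restart(struct model_gt *gt)
{
	if (!gt->restart_works) {
		gt->wedged = true;
		return false;
	}
	return true;
}

/* Unwedge followed by restart, as on the full-reset and resume paths. */
static bool model_recover(struct model_gt *gt)
{
	assert(gt != NULL);

	if (!model_unset_wedged(gt))
		return false;
	return model_restart(gt);
}
```

In this model the GT ends up unwedged only if both the reset and the
restart succeed. A failed reset, a device where the reset would clobber the
display, or a failed restart all leave the wedged bit set.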

gem_eio/suspend should be testing the recovery upon resume path, and even
gem_eio/*-stress should give reasonable coverage of the normal recovery
via full reset.

> Apart from that, the patch LGTM. Worth noting that with this change a
> successful reset is required to unwedge even after a suspend/resume
> cycle (in gem_sanitize), which is a good thing IMO.

Hence why relaxing the gpu_clobbers_display is important to retain the
ability to clear wedged across suspend on older devices.
-Chris