Quoting Rodrigo Vivi (2018-07-05 21:44:56)
> On Thu, Jul 05, 2018 at 04:02:14PM +0100, Chris Wilson wrote:
> > If the GPU is irrecoverably wedged on startup, it means that it failed
> > on initialisation and we have already tried to reset it but failed. We
> > can ignore all further testing, as it is already dead. Failing early,
> > prevents us from slowly failing in our endeavours later and timing out.
> >
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > ---
> >  drivers/gpu/drm/i915/selftests/intel_hangcheck.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> > index fe7d3190ebfe..fca073c96c2d 100644
> > --- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> > +++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> > @@ -1243,6 +1243,9 @@ int intel_hangcheck_live_selftests(struct drm_i915_private *i915)
> >       if (!intel_has_gpu_reset(i915))
> >               return 0;
> >
> > +     if (i915_terminally_wedged(&i915->gpu_error))
> > +             return -EIO; /* we're long past hope of a successful reset */
> > +
>
> Maybe -ENOTRECOVERABLE ?

Interesting choice. Our convention so far has been -EIO for losing state
due to a GPU hang, but an extra flavour for when we wedge the driver?

Hmm, fence->error needs to remain -EIO (differentiating between
reset and wedge for userspace seems to convey no more information imo),
and we've already baked

	if (i915_terminally_wedged(&i915->gpu_error))
		return -EIO;

into the ABI for the points of interest.

Sadly it is too late; I don't think we can pick another errno for the
cases where it actually matters.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
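
For reference, the ordering of the two early exits discussed above can be
sketched outside the kernel. This is only an illustration under stated
assumptions: struct fake_device, struct fake_gpu_error,
fake_terminally_wedged() and fake_hangcheck_live_selftests() are hypothetical
stand-ins, not the real i915 types or functions; only the return values
(0 to skip, -EIO when wedged) follow the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's error state (not struct i915_gpu_error). */
struct fake_gpu_error {
	bool wedged; /* a reset was attempted and failed; the device is given up on */
};

/* Hypothetical stand-in for the device (not struct drm_i915_private). */
struct fake_device {
	bool has_gpu_reset;
	struct fake_gpu_error gpu_error;
};

/* Assumed helper, modelled on the name used in the patch. */
static bool fake_terminally_wedged(const struct fake_gpu_error *error)
{
	return error->wedged;
}

/*
 * Same ordering as the patched function: no reset support means there is
 * nothing to test (return 0), and a wedged device fails immediately with
 * -EIO rather than timing out in each subtest later.
 */
static int fake_hangcheck_live_selftests(const struct fake_device *dev)
{
	if (!dev->has_gpu_reset)
		return 0;

	if (fake_terminally_wedged(&dev->gpu_error))
		return -EIO; /* same errno as losing state to a GPU hang */

	/* ... the actual subtests would run here ... */
	return 0;
}
```

Because the wedged case returns the same -EIO that a hang would, a caller
checking only the errno cannot tell the two apart, which is the point being
made about the existing ABI.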