Re: [RFC PATCH] drm/i915/debugfs: Only wedge if we have reset available

On 02/10/2019 13:48, Janusz Krzysztofik wrote:
If we process DROP_RESET_ACTIVE and cancel all outstanding requests by
forcing a GPU reset on hardware with reset capabilities disabled or
not supported, we certainly end up with a terminally wedged GPU,
impossible to recover.  That's probably not what we want.

I'm afraid I have forgotten the background story here. Is the concern the IGT exit handler calling DROP_RESET_ACTIVE? If so, with this patch it will fail with -EBUSY, which could be fine, but what happens from the perspective of the next test that gets to run? It won't find a wedged GPU, but it will encounter a possibly nondeterministic amount of GPU work scheduled before it, no?
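
To make the question concrete, here is a rough userspace sketch of what a caller such as an exit handler might do with the new return value. It is only an illustration: the `sketch_*` names are hypothetical, the bit value is assumed to mirror DROP_RESET_ACTIVE in i915_debugfs.c, and the debugfs path in the usage note is assumed to be the usual location of i915_gem_drop_caches. The point is that -EBUSY would need to be told apart from other failures, since it means the engines may still be executing earlier work.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Assumption: mirrors the kernel's DROP_RESET_ACTIVE bit; verify against i915_debugfs.c. */
#define SKETCH_DROP_RESET_ACTIVE (1ull << 7)

/* Hypothetical outcome classes for a drop-caches write. */
enum sketch_drop_result {
	SKETCH_DROP_OK,         /* write accepted by the kernel */
	SKETCH_DROP_STILL_BUSY, /* -EBUSY: engines not idle and no reset available */
	SKETCH_DROP_ERROR,      /* any other failure */
};

/*
 * Hypothetical helper: write a hex mask to a debugfs-style control file.
 * Returns 0 on success or a negative errno value on failure.
 */
static int sketch_write_drop_mask(const char *path, uint64_t mask)
{
	char buf[32];
	ssize_t ret;
	int len, fd, err = 0;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	len = snprintf(buf, sizeof(buf), "0x%llx", (unsigned long long)mask);
	ret = write(fd, buf, len);
	if (ret < 0)
		err = -errno;	/* e.g. -EBUSY with the proposed patch */
	else if (ret != len)
		err = -EIO;	/* short write, treated as a generic error */

	close(fd);
	return err;
}

/* Map the helper's return value onto the outcome classes above. */
static enum sketch_drop_result sketch_classify(int err)
{
	if (err == 0)
		return SKETCH_DROP_OK;
	if (err == -EBUSY)
		return SKETCH_DROP_STILL_BUSY;
	return SKETCH_DROP_ERROR;
}
```

A caller would then use something like `sketch_classify(sketch_write_drop_mask("/sys/kernel/debug/dri/0/i915_gem_drop_caches", SKETCH_DROP_RESET_ACTIVE))` and, on SKETCH_DROP_STILL_BUSY, decide whether to wait, skip, or abort the run rather than let the next test start on a busy GPU.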

Regards,

Tvrtko

Before setting the GPU wedged, verify if we have GPU reset available
and fail with -EBUSY if not.

Suggested-by: Petri Latvala <petri.latvala@xxxxxxxxx>
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@xxxxxxxxxxxxxxx>
Cc: Michał Wajdeczko <michal.wajdeczko@xxxxxxxxx>
Cc: Michał Winiarski <michal.winiarski@xxxxxxxxx>
Cc: Piotr Piórkowski <piotr.piorkowski@xxxxxxxxx>
Cc: Tomasz Lis <tomasz.lis@xxxxxxxxx>
Cc: Petri Latvala <petri.latvala@xxxxxxxxx>
Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Cc: Martin Peres <martin.peres@xxxxxxxxxxxxxxx>
---
  drivers/gpu/drm/i915/i915_debugfs.c | 11 ++++++++++-
  1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index fec9fb7cc384..0774ca6e2a05 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3627,8 +3627,17 @@ i915_drop_caches_set(void *data, u64 val)
  	if (val & DROP_RESET_ACTIVE &&
  	    wait_for(intel_engines_are_idle(&i915->gt),
-		     I915_IDLE_ENGINES_TIMEOUT))
+		     I915_IDLE_ENGINES_TIMEOUT)) {
+		/*
+		 * Only wedge if reset is supported and not disabled, otherwise
+		 * we certainly end up with the GPU terminally wedged.  Inform
+		 * userspace about the problem instead.
+		 */
+		if (!intel_has_gpu_reset(&i915->gt))
+			return -EBUSY;
+
  		intel_gt_set_wedged(&i915->gt);
+	}
  	/* No need to check and wait for gpu resets, only libdrm auto-restarts
  	 * on ioctls on -EAGAIN. */
