On 3/19/2018 5:44 PM, Chris Wilson wrote:
Quoting Michel Thierry (2018-03-20 00:39:35)
On 3/19/2018 5:18 PM, Chris Wilson wrote:
Not all callers want the GPU error to be handled in the same way, so expose
a control parameter. In the first instance, some callers do not want the
heavyweight error capture so add a bit to request the state to be
captured and saved.
v2: Pass msg down to i915_reset/i915_reset_engine so that we include the
reason for the reset in the dev_notice(), superseding the earlier option
to not print that notice.
Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Jeff McGee <jeff.mcgee@xxxxxxxxx>
Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxx>
Cc: Michel Thierry <michel.thierry@xxxxxxxxx>
---
drivers/gpu/drm/i915/i915_debugfs.c | 4 +--
drivers/gpu/drm/i915/i915_drv.c | 17 +++++------
drivers/gpu/drm/i915/i915_drv.h | 10 +++---
drivers/gpu/drm/i915/i915_irq.c | 39 +++++++++++++-----------
drivers/gpu/drm/i915/intel_hangcheck.c | 13 ++++----
drivers/gpu/drm/i915/selftests/intel_hangcheck.c | 13 +++-----
6 files changed, 48 insertions(+), 48 deletions(-)
...
diff --git a/drivers/gpu/drm/i915/intel_hangcheck.c b/drivers/gpu/drm/i915/intel_hangcheck.c
index 42e45ae87393..fd0ffb8328d0 100644
--- a/drivers/gpu/drm/i915/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/intel_hangcheck.c
@@ -246,9 +246,8 @@ engine_stuck(struct intel_engine_cs *engine, u64 acthd)
*/
tmp = I915_READ_CTL(engine);
if (tmp & RING_WAIT) {
- i915_handle_error(dev_priv, 0,
- "Kicking stuck wait on %s",
- engine->name);
+ i915_handle_error(dev_priv, BIT(engine->id), 0,
+ "stuck wait on %s", engine->name);
Before, we were not resetting anything here. Is this change on purpose?
(If it is, it's worth adding to the commit msg, since it changes
behavior.)
I915_WRITE_CTL(engine, tmp);
return ENGINE_WAIT_KICK;
}
@@ -258,8 +257,8 @@ engine_stuck(struct intel_engine_cs *engine, u64 acthd)
default:
return ENGINE_DEAD;
case 1:
- i915_handle_error(dev_priv, 0,
- "Kicking stuck semaphore on %s",
+ i915_handle_error(dev_priv, ALL_ENGINES, 0,
Same here,
Both are functionally no-op changes, as they are only for !per-engine
platforms (unless someone manages to send just the wrong type of garbage
to the GPU). I just thought it interesting to document that wait-event
needs a local kick and the wait-sema needs to kick the other engines.
i915_handle_error has this before full reset:
if (!engine_mask)
goto out;
No reset at all was happening before.
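
To make the point about the early-out concrete, here is a toy model of the control flow being discussed. It is only a sketch, not the driver code: sketch_handle_error(), the sketch_outcome enum and the has_per_engine_reset parameter are hypothetical names made up for illustration, and it assumes a per-engine reset always succeeds and clears that engine from the mask.

```c
#include <stdbool.h>

/* Hypothetical outcome codes for this sketch; not part of the i915 driver. */
enum sketch_outcome {
	SKETCH_NO_RESET,
	SKETCH_ENGINE_RESET_ONLY,
	SKETCH_FULL_RESET,
};

/*
 * Toy model of the flow discussed in the thread. Assumptions: per-engine
 * resets always succeed where the platform supports them, and an engine
 * that was reset is cleared from the mask.
 */
static enum sketch_outcome
sketch_handle_error(unsigned int engine_mask, bool has_per_engine_reset)
{
	bool reset_any = false;

	if (has_per_engine_reset) {
		for (unsigned int id = 0; id < 8 * sizeof(engine_mask); id++) {
			unsigned int bit = 1u << id;

			if (engine_mask & bit) {
				engine_mask &= ~bit;	/* engine handled locally */
				reset_any = true;
			}
		}
	}

	/* The early-out quoted above: an empty mask skips the full reset. */
	if (!engine_mask)
		goto out;

	return SKETCH_FULL_RESET;

out:
	return reset_any ? SKETCH_ENGINE_RESET_ONLY : SKETCH_NO_RESET;
}
```

In this model a mask of 0 never resets anything, which matches the old callers. A non-zero mask on a platform without per-engine reset falls through to the full reset, which is the behavior change being asked about.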