Re: drm/i915: Watchdog timeout: IRQ handler for gen8+

Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> · Mon, 7 Jan 2019 11:58:13 +0000

Hi,

This series has not been recognized by Patchwork as such, nor are the 
patches numbered. Have you used git format-patch -<N> --cover-letter and 
git send-email to send it out?

Rest inline.

On 05/01/2019 02:39, Carlos Santa wrote:
From: Michel Thierry <michel.thierry@xxxxxxxxx>

*** General ***

Watchdog timeout (or "media engine reset") is a feature that allows
userland applications to enable hang detection on individual batch buffers.
The detection mechanism itself is mostly bound to the hardware and the only
thing that the driver needs to do to support this form of hang detection
is to implement the interrupt handling support as well as watchdog command
emission before and after the emitted batch buffer start instruction in the
ring buffer.

The principle of the hang detection mechanism is as follows:

1. Once the decision has been made to enable watchdog timeout for a
particular batch buffer and the driver is in the process of emitting the
batch buffer start instruction into the ring buffer it also emits a
watchdog timer start instruction before and a watchdog timer cancellation
instruction after the batch buffer start instruction in the ring buffer.

2. Once the GPU execution reaches the watchdog timer start instruction
the hardware watchdog counter is started by the hardware. The counter
keeps counting until either reaching a previously configured threshold
value or the timer cancellation instruction is executed.

2a. If the counter reaches the threshold value the hardware fires a
watchdog interrupt that is picked up by the watchdog interrupt handler.
This means that a hang has been detected and the driver needs to deal with
it the same way it would deal with an engine hang detected by the periodic
hang checker. The only difference between the two is that we already blamed
the active request (to ensure an engine reset).

What happens if the watchdog fires but the "guilty" request completes 
before the interrupt has been delivered, or acted upon? Would that mean 
an innocent request could be blamed for the timeout? Maybe the answer 
comes later in the patch/series.


2b. If the batch buffer completes and the execution reaches the watchdog
cancellation instruction before the watchdog counter reaches its
threshold value the watchdog is cancelled and nothing more comes of it.
No hang is detected.
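
To check my reading of steps 1, 2, 2a and 2b, here is a rough software model of the counter. All names are made up for illustration and none of this is driver code; in particular I am assuming the counter keeps running after it hits the threshold, which is how I read the "stop the counter" comment later in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of the hardware counter, illustration only. */
struct wd_model {
	bool running;
	uint32_t count;
	uint32_t threshold;
};

/* Steps 1/2: the "watchdog start" command arms the counter. */
static void wd_start(struct wd_model *wd, uint32_t threshold)
{
	assert(wd);
	wd->running = true;
	wd->count = 0;
	wd->threshold = threshold;
}

/* Step 2b: the "watchdog cancel" command after the batch stops it. */
static void wd_cancel(struct wd_model *wd)
{
	assert(wd);
	wd->running = false;
}

/* Advance one tick; returns true when the interrupt would fire (step 2a). */
static bool wd_tick(struct wd_model *wd)
{
	assert(wd);
	if (!wd->running)
		return false;

	return ++wd->count >= wd->threshold;
}

/* Run a batch needing batch_ticks; returns true if a hang is flagged. */
bool wd_run_batch(uint32_t threshold, uint32_t batch_ticks)
{
	struct wd_model wd;
	uint32_t t;

	wd_start(&wd, threshold);
	for (t = 0; t < batch_ticks; t++) {
		if (wd_tick(&wd))
			return true;	/* 2a: threshold reached first */
	}
	wd_cancel(&wd);			/* 2b: batch finished in time */

	return false;
}
```

In this model a batch shorter than the threshold is never flagged, and anything that reaches the threshold is flagged regardless of whether it would have completed a tick later.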

Note about future interaction with preemption: Preemption could happen
in a command sequence prior to watchdog counter getting disabled,
resulting in watchdog being triggered following preemption (e.g. when
watchdog had been enabled in the low priority batch). The driver will
need to explicitly disable the watchdog counter as part of the
preemption sequence.

Does the series take care of preemption?


*** This patch introduces: ***

1. IRQ handler code for watchdog timeout allowing direct hang recovery
based on hardware-driven hang detection, which then integrates directly
with the hang recovery path. This is independent of having per-engine reset
or just full gpu reset.

2. Watchdog specific register information.

Currently the render engine and all available media engines support
watchdog timeout (VECS is only supported in GEN9). The specifications allude
to the BCS engine being supported but that is currently not supported by
this commit.

Note that the value to stop the counter is different between render and
non-render engines in GEN8; GEN9 onwards it's the same.

v2: Move irq handler to tasklet, arm watchdog for a 2nd time to check
against false-positives.

v3: Don't use high priority tasklet, use engine_last_submit while
checking for false-positives. From GEN9 onwards, the stop counter bit is
the same for all engines.

v4: Remove unnecessary brackets, use current_seqno to mark the request
as guilty in the hangcheck/capture code.

v5: Rebased after RESET_ENGINEs flag.

v6: Don't capture error state in case of watchdog timeout. The capture
process is time consuming and this will align to what happens when we
use GuC to handle the watchdog timeout. (Chris)

v7: Rebase.

v8: Rebase, use HZ to reschedule.

v9: Rebase, get forcewake domains in function (no longer in execlists
struct).

v10: Rebase.

Cc: Antonio Argenziano <antonio.argenziano@xxxxxxxxx>
Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx>
Signed-off-by: Michel Thierry <michel.thierry@xxxxxxxxx>
Signed-off-by: Carlos Santa <carlos.santa@xxxxxxxxx>
---
  drivers/gpu/drm/i915/i915_gpu_error.h   |  4 ++
  drivers/gpu/drm/i915/i915_irq.c         | 14 +++-
  drivers/gpu/drm/i915/i915_reg.h         |  6 ++
  drivers/gpu/drm/i915/intel_hangcheck.c  | 17 +++--
  drivers/gpu/drm/i915/intel_lrc.c        | 86 +++++++++++++++++++++++++
  drivers/gpu/drm/i915/intel_ringbuffer.h |  6 ++
  6 files changed, 126 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h
index 6d9f45468ac1..7130786aa5b4 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.h
+++ b/drivers/gpu/drm/i915/i915_gpu_error.h
@@ -256,6 +256,9 @@ struct i915_gpu_error {
  	 * inspect the bit and do the reset directly, otherwise the worker
  	 * waits for the struct_mutex.
  	 *
+	 * #I915_RESET_WATCHDOG - When hw detects a hang before us, we can use
+	 * I915_RESET_WATCHDOG to report the hang detection cause accurately.
+	 *
  	 * #I915_RESET_ENGINE[num_engines] - Since the driver doesn't need to
  	 * acquire the struct_mutex to reset an engine, we need an explicit
  	 * flag to prevent two concurrent reset attempts in the same engine.
@@ -271,6 +274,7 @@ struct i915_gpu_error {
  #define I915_RESET_BACKOFF	0
  #define I915_RESET_HANDOFF	1
  #define I915_RESET_MODESET	2
+#define I915_RESET_WATCHDOG	3
  #define I915_WEDGED		(BITS_PER_LONG - 1)
  #define I915_RESET_ENGINE	(I915_WEDGED - I915_NUM_ENGINES)
  
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index fbb094ecf6c9..859bbadb752f 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1498,6 +1498,9 @@ gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir)
  
  	if (tasklet)
  		tasklet_hi_schedule(&engine->execlists.tasklet);
+
+	if (iir & (GT_GEN8_WATCHDOG_INTERRUPT))

The parentheses around GT_GEN8_WATCHDOG_INTERRUPT are not needed.

+		tasklet_schedule(&engine->execlists.watchdog_tasklet);
  }
  
  static void gen8_gt_irq_ack(struct drm_i915_private *i915,
@@ -3329,7 +3332,7 @@ void i915_handle_error(struct drm_i915_private *dev_priv,
  	if (intel_has_reset_engine(dev_priv) &&
  	    !i915_terminally_wedged(&dev_priv->gpu_error)) {
  		for_each_engine_masked(engine, dev_priv, engine_mask, tmp) {
-			BUILD_BUG_ON(I915_RESET_MODESET >= I915_RESET_ENGINE);
+			BUILD_BUG_ON(I915_RESET_WATCHDOG >= I915_RESET_ENGINE);
  			if (test_and_set_bit(I915_RESET_ENGINE + engine->id,
  					     &dev_priv->gpu_error.flags))
  				continue;
@@ -4162,12 +4165,15 @@ static void gen8_gt_irq_postinstall(struct drm_i915_private *dev_priv)
  	uint32_t gt_interrupts[] = {
  		GT_RENDER_USER_INTERRUPT << GEN8_RCS_IRQ_SHIFT |
  			GT_CONTEXT_SWITCH_INTERRUPT << GEN8_RCS_IRQ_SHIFT |
+			GT_GEN8_WATCHDOG_INTERRUPT << GEN8_RCS_IRQ_SHIFT |
  			GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT |
  			GT_CONTEXT_SWITCH_INTERRUPT << GEN8_BCS_IRQ_SHIFT,
  		GT_RENDER_USER_INTERRUPT << GEN8_VCS1_IRQ_SHIFT |
  			GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS1_IRQ_SHIFT |
+			GT_GEN8_WATCHDOG_INTERRUPT << GEN8_VCS1_IRQ_SHIFT |
  			GT_RENDER_USER_INTERRUPT << GEN8_VCS2_IRQ_SHIFT |
-			GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS2_IRQ_SHIFT,
+			GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS2_IRQ_SHIFT |
+			GT_GEN8_WATCHDOG_INTERRUPT << GEN8_VCS2_IRQ_SHIFT,
  		0,
  		GT_RENDER_USER_INTERRUPT << GEN8_VECS_IRQ_SHIFT |
  			GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VECS_IRQ_SHIFT
@@ -4176,6 +4182,10 @@ static void gen8_gt_irq_postinstall(struct drm_i915_private *dev_priv)
  	if (HAS_L3_DPF(dev_priv))
  		gt_interrupts[0] |= GT_RENDER_L3_PARITY_ERROR_INTERRUPT;
  
+	/* VECS watchdog is only available in skl+ */
+	if (INTEL_GEN(dev_priv) >= 9)
+		gt_interrupts[3] |= GT_GEN8_WATCHDOG_INTERRUPT;

Is the shift missing here?
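
What I would expect, for consistency with the gt_interrupts[] initialiser above, is something like the standalone sketch below. The macros are local stand-ins so that it compiles outside the kernel: the bit value is taken from this patch, while the shift value of zero is my assumption and should be checked against i915_reg.h. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins; GEN8_VECS_IRQ_SHIFT == 0 is an assumption. */
#define GT_GEN8_WATCHDOG_INTERRUPT	(1u << 6)
#define GEN8_VECS_IRQ_SHIFT		0

/* Hypothetical helper returning the extra VECS bits for gt_interrupts[3]. */
uint32_t vecs_watchdog_irq_bits(int gen)
{
	uint32_t bits = 0;

	assert(gen >= 8);

	/* VECS watchdog is only available in skl+ */
	if (gen >= 9)
		bits |= GT_GEN8_WATCHDOG_INTERRUPT << GEN8_VECS_IRQ_SHIFT;

	return bits;
}
```

If the shift really is zero the generated value does not change, but spelling it out avoids relying on that.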

+
  	dev_priv->pm_ier = 0x0;
  	dev_priv->pm_imr = ~dev_priv->pm_ier;
  	GEN8_IRQ_INIT_NDX(GT, 0, ~gt_interrupts[0], gt_interrupts[0]);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 44958d994bfa..fff330643090 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2335,6 +2335,11 @@ enum i915_power_well_id {
  #define RING_START(base)	_MMIO((base) + 0x38)
  #define RING_CTL(base)		_MMIO((base) + 0x3c)
  #define   RING_CTL_SIZE(size)	((size) - PAGE_SIZE) /* in bytes -> pages */
+#define RING_CNTR(base)		_MMIO((base) + 0x178)
+#define   GEN8_WATCHDOG_ENABLE		0
+#define   GEN8_WATCHDOG_DISABLE		1
+#define   GEN8_XCS_WATCHDOG_DISABLE	0xFFFFFFFF /* GEN8 & non-render only */
+#define RING_THRESH(base)	_MMIO((base) + 0x17C)
  #define RING_SYNC_0(base)	_MMIO((base) + 0x40)
  #define RING_SYNC_1(base)	_MMIO((base) + 0x44)
  #define RING_SYNC_2(base)	_MMIO((base) + 0x48)
@@ -2894,6 +2899,7 @@ enum i915_power_well_id {
  #define GT_BSD_USER_INTERRUPT			(1 << 12)
  #define GT_RENDER_L3_PARITY_ERROR_INTERRUPT_S1	(1 << 11) /* hsw+; rsvd on snb, ivb, vlv */
  #define GT_CONTEXT_SWITCH_INTERRUPT		(1 <<  8)
+#define GT_GEN8_WATCHDOG_INTERRUPT		(1 <<  6) /* gen8+ */
  #define GT_RENDER_L3_PARITY_ERROR_INTERRUPT	(1 <<  5) /* !snb */
  #define GT_RENDER_PIPECTL_NOTIFY_INTERRUPT	(1 <<  4)
  #define GT_RENDER_CS_MASTER_ERROR_INTERRUPT	(1 <<  3)
diff --git a/drivers/gpu/drm/i915/intel_hangcheck.c b/drivers/gpu/drm/i915/intel_hangcheck.c
index 51e9efec5116..2906f0ef3d77 100644
--- a/drivers/gpu/drm/i915/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/intel_hangcheck.c
@@ -213,7 +213,8 @@ static void hangcheck_accumulate_sample(struct intel_engine_cs *engine,
  
  static void hangcheck_declare_hang(struct drm_i915_private *i915,
  				   unsigned int hung,
-				   unsigned int stuck)
+				   unsigned int stuck,
+				   unsigned int watchdog)
  {
  	struct intel_engine_cs *engine;
  	char msg[80];
@@ -226,13 +227,16 @@ static void hangcheck_declare_hang(struct drm_i915_private *i915,
  	if (stuck != hung)
  		hung &= ~stuck;
  	len = scnprintf(msg, sizeof(msg),
-			"%s on ", stuck == hung ? "no progress" : "hang");
+			"%s on ", watchdog ? "watchdog timeout" :
+				  stuck == hung ? "no progress" : "hang");
  	for_each_engine_masked(engine, i915, hung, tmp)
  		len += scnprintf(msg + len, sizeof(msg) - len,
  				 "%s, ", engine->name);
  	msg[len-2] = '\0';
  
-	return i915_handle_error(i915, hung, I915_ERROR_CAPTURE, "%s", msg);
+	return i915_handle_error(i915, hung,
+				 watchdog ? 0 : I915_ERROR_CAPTURE,
+				 "%s", msg);
  }
  
  /*
@@ -250,7 +254,7 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
  			     gpu_error.hangcheck_work.work);
  	struct intel_engine_cs *engine;
  	enum intel_engine_id id;
-	unsigned int hung = 0, stuck = 0, wedged = 0;
+	unsigned int hung = 0, stuck = 0, wedged = 0, watchdog = 0;
  
  	if (!i915_modparams.enable_hangcheck)
  		return;
@@ -261,6 +265,9 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
  	if (i915_terminally_wedged(&dev_priv->gpu_error))
  		return;
  
+	if (test_and_clear_bit(I915_RESET_WATCHDOG, &dev_priv->gpu_error.flags))
+		watchdog = 1;
+
  	/* As enabling the GPU requires fairly extensive mmio access,
  	 * periodically arm the mmio checker to see if we are triggering
  	 * any invalid access.
@@ -293,7 +300,7 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
  	}
  
  	if (hung)
-		hangcheck_declare_hang(dev_priv, hung, stuck);
+		hangcheck_declare_hang(dev_priv, hung, stuck, watchdog);
  
  	/* Reset timer in case GPU hangs without another request being added */
  	i915_queue_hangcheck(dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 6c98fb7cebf2..e1dcdf545bee 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2027,6 +2027,70 @@ static int gen8_emit_flush_render(struct i915_request *request,
  	return 0;
  }
  
+/* From GEN9 onwards, all engines use the same RING_CNTR format */
+static inline u32 get_watchdog_disable(struct intel_engine_cs *engine)
+{
+	if (engine->id == RCS || INTEL_GEN(engine->i915) >= 9)
+		return GEN8_WATCHDOG_DISABLE;
+	else
+		return GEN8_XCS_WATCHDOG_DISABLE;
+}
+
+#define GEN8_WATCHDOG_1000US 0x2ee0 //XXX: Temp, replace with helper function

Please do then. :)
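
Something along these lines might do as a starting point. It is only a sketch: the 12 ticks per microsecond figure is inferred purely from 0x2ee0 (12000) standing for 1000us in this patch, the real counter frequency may well differ per platform, and the helper name is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: inferred only from GEN8_WATCHDOG_1000US == 0x2ee0 (12000). */
#define WATCHDOG_TICKS_PER_US	12u

_Static_assert(1000u * WATCHDOG_TICKS_PER_US == 0x2ee0,
	       "must agree with the constant used in the patch");

/* Hypothetical helper; saturates instead of wrapping the 32-bit threshold. */
uint32_t watchdog_us_to_ticks(uint64_t us)
{
	if (us > UINT32_MAX / WATCHDOG_TICKS_PER_US)
		return UINT32_MAX;

	return (uint32_t)(us * WATCHDOG_TICKS_PER_US);
}
```

Saturating rather than wrapping means an over-large request from userspace ends up as the longest possible timeout instead of an arbitrarily short one; rejecting it with an error would be the other obvious choice.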

+static void gen8_watchdog_irq_handler(unsigned long data)
+{
+	struct intel_engine_cs *engine = (struct intel_engine_cs *)data;
+	struct drm_i915_private *dev_priv = engine->i915;
+	enum forcewake_domains fw_domains;
+	u32 current_seqno;
+
+	switch (engine->id) {
+	default:
+		MISSING_CASE(engine->id);
+		/* fall through */
+	case RCS:
+		fw_domains = FORCEWAKE_RENDER;
+		break;
+	case VCS:
+	case VCS2:
+	case VECS:
+		fw_domains = FORCEWAKE_MEDIA;
+		break;
+	}
+
+	intel_uncore_forcewake_get(dev_priv, fw_domains);

I'd be tempted to drop this and just use I915_WRITE. It doesn't feel 
like there is any performance to be gained with it and it embeds too 
much knowledge here.

Alternatively, if you want to keep it, consider using 
intel_uncore_forcewake_for_reg to leave the fw domain knowledge out of 
here. See for instance how it is used in intel_engine_cs.c.

+
+	/* Stop the counter to prevent further timeout interrupts */
+	I915_WRITE_FW(RING_CNTR(engine->mmio_base), get_watchdog_disable(engine));

What if we disable the watchdog for a batch following the falsely 
accused one? I mean this:

1. Batch 1 runs
2. IRQ fires -> tasklet_schedule
3. Batch 2 runs (can be different context)
4. Tasklet runs
5. Watchdog gets disabled
6. Batch 2 hangs - but watchdog has been disabled

?
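
To make the sequence concrete, here is a toy model of just those six steps. It deliberately ignores the re-arm path further down in the handler, and every name in it is invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: one engine, watchdog armed per batch, tasklet deferred. */
struct toy_engine {
	bool wd_enabled;	/* counter running for the current batch */
	bool tasklet_pending;	/* IRQ seen, tasklet not yet run */
};

static void toy_batch_start(struct toy_engine *e)
{
	assert(e);
	e->wd_enabled = true;
}

static void toy_irq_fires(struct toy_engine *e)
{
	assert(e);
	e->tasklet_pending = true;
}

static void toy_tasklet_runs(struct toy_engine *e)
{
	assert(e);
	if (!e->tasklet_pending)
		return;

	e->tasklet_pending = false;
	e->wd_enabled = false;	/* unconditional stop, as in the patch */
}

/* Returns whether batch 2 still has a running watchdog at step 6. */
bool toy_batch2_still_protected(void)
{
	struct toy_engine e = { false, false };

	toy_batch_start(&e);	/* 1. batch 1 runs */
	toy_irq_fires(&e);	/* 2. IRQ fires -> tasklet_schedule */
	toy_batch_start(&e);	/* 3. batch 2 runs */
	toy_tasklet_runs(&e);	/* 4. + 5. tasklet runs, disables watchdog */

	return e.wd_enabled;	/* 6. batch 2 hangs unprotected if false */
}
```

With the unconditional stop, the model leaves batch 2 without a running watchdog.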

+
+	current_seqno = intel_engine_get_seqno(engine);
+
+	/* did the request complete after the timer expired? */
+	if (intel_engine_last_submit(engine) == current_seqno)
+		goto fw_put;
+
+	if (engine->hangcheck.watchdog == current_seqno) {
+		/* Make sure the active request will be marked as guilty */
+		engine->hangcheck.stalled = true;
+		engine->hangcheck.acthd = intel_engine_get_active_head(engine);
+		engine->hangcheck.seqno = current_seqno;
+
+		/* And try to run the hangcheck_work as soon as possible */
+		set_bit(I915_RESET_WATCHDOG, &dev_priv->gpu_error.flags);
+		queue_delayed_work(system_long_wq,
+				   &dev_priv->gpu_error.hangcheck_work,
+				   round_jiffies_up_relative(HZ));
+	} else {
+		engine->hangcheck.watchdog = current_seqno;

The logic above potentially handles my previous question? Could be if 
batch 2 hangs. But..

+		/* Re-start the counter, if really hung, it will expire again */
+		I915_WRITE_FW(RING_THRESH(engine->mmio_base), GEN8_WATCHDOG_1000US);
+		I915_WRITE_FW(RING_CNTR(engine->mmio_base), GEN8_WATCHDOG_ENABLE);

.. the timeout will be wrong, i.e. it will not respect what userspace 
set. So I don't think it will work. This code either needs to cope with 
a batch that is already running with the watchdog enabled, or it needs 
to somehow fish out the correct timeout to set here.

+	}
+
+fw_put:
+	intel_uncore_forcewake_put(dev_priv, fw_domains);
+}
+
  /*
   * Reserve space for 2 NOOPs at the end of each request to be
   * used as a workaround for not being allowed to do lite
@@ -2115,6 +2179,9 @@ void intel_logical_ring_cleanup(struct intel_engine_cs *engine)
  			     &engine->execlists.tasklet.state)))
  		tasklet_kill(&engine->execlists.tasklet);
  
+	if (WARN_ON(test_bit(TASKLET_STATE_SCHED, &engine->execlists.watchdog_tasklet.state)))
+		tasklet_kill(&engine->execlists.watchdog_tasklet);
+

I don't see any code ensuring this WARN can't fire if the tasklet gets 
delayed? A tasklet_kill in intel_engines_park might be enough.

  	dev_priv = engine->i915;
  
  	if (engine->buffer) {
@@ -2208,6 +2275,22 @@ logical_ring_default_irqs(struct intel_engine_cs *engine)
  
  	engine->irq_enable_mask = GT_RENDER_USER_INTERRUPT << shift;
  	engine->irq_keep_mask = GT_CONTEXT_SWITCH_INTERRUPT << shift;
+
+	switch (engine->id) {
+	default:
+		/* BCS engine does not support hw watchdog */
+		break;
+	case RCS:
+	case VCS:
+	case VCS2:

Please change all of these to class-based checks, or maintenance gets 
hard. Even more so since, as written, this is broken on ICL.
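
Roughly what I have in mind, as a standalone sketch. The enum is a local mirror invented for illustration rather than the driver's own definitions, and the support matrix is only what this commit message states (render and video decode on gen8+, video enhancement from gen9, copy engine not supported).

```c
#include <assert.h>
#include <stdbool.h>

/* Local mirror of the engine classes; names are made up for this sketch. */
enum toy_engine_class {
	TOY_RENDER_CLASS,
	TOY_VIDEO_DECODE_CLASS,
	TOY_VIDEO_ENHANCEMENT_CLASS,
	TOY_COPY_ENGINE_CLASS,
};

/* Hypothetical predicate replacing the per-engine-id switch. */
bool toy_engine_has_watchdog(enum toy_engine_class cls, int gen)
{
	assert(gen >= 8);

	switch (cls) {
	case TOY_RENDER_CLASS:
	case TOY_VIDEO_DECODE_CLASS:
		return true;
	case TOY_VIDEO_ENHANCEMENT_CLASS:
		/* VECS watchdog is only available in skl+ */
		return gen >= 9;
	case TOY_COPY_ENGINE_CLASS:
	default:
		return false;
	}
}
```

Keyed on class, additional instances of an engine type are covered without having to list every engine id.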

+		engine->irq_keep_mask |= (GT_GEN8_WATCHDOG_INTERRUPT << shift);

The outer parentheses are not needed here and below.

+		break;
+	case VECS:
+		if (INTEL_GEN(engine->i915) >= 9)
+			engine->irq_keep_mask |=
+				(GT_GEN8_WATCHDOG_INTERRUPT << shift);
+		break;
+	}
  }
  
  static void
@@ -2221,6 +2304,9 @@ logical_ring_setup(struct intel_engine_cs *engine)
  	tasklet_init(&engine->execlists.tasklet,
  		     execlists_submission_tasklet, (unsigned long)engine);
  
+	tasklet_init(&engine->execlists.watchdog_tasklet,
+		     gen8_watchdog_irq_handler, (unsigned long)engine);
+
  	logical_ring_default_vfuncs(engine);
  	logical_ring_default_irqs(engine);
  }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 3c1366c58cf3..6cb8b4280035 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -120,6 +120,7 @@ struct intel_instdone {
  struct intel_engine_hangcheck {
  	u64 acthd;
  	u32 seqno;
+	u32 watchdog;
  	enum intel_engine_hangcheck_action action;
  	unsigned long action_timestamp;
  	int deadlock;
@@ -224,6 +225,11 @@ struct intel_engine_execlists {
  	 */
  	struct tasklet_struct tasklet;
  
+	/**
+	 * @watchdog_tasklet: stop counter and re-schedule hangcheck_work asap
+	 */
+	struct tasklet_struct watchdog_tasklet;
+
  	/**
  	 * @default_priolist: priority list for I915_PRIORITY_NORMAL
  	 */


Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx