On 16/06/2015 14:44, Daniel Vetter wrote:
On Mon, Jun 08, 2015 at 06:03:19PM +0100, Tomas Elf wrote:
When submitting semaphores in execlist mode the hang checker crashes in this
function because it is only runnable in ring submission mode. The reason this
is of particular interest to the TDR patch series is because we use semaphores
as a mean to induce hangs during testing (which is the recommended way to
induce hangs for gen8+). It's not clear how this is supposed to work in
execlist mode since:
1. This function requires a ring buffer.
2. Retrieving a ring buffer in execlist mode requires us to retrieve the
corresponding context, which we get from a request.
3. Retieving a request from the hang checker is not straight-forward since that
requires us to grab the struct_mutex in order to synchronize against the
request retirement thread.
4. Grabbing the struct_mutex from the hang checker is nothing that we will do
since that puts us at risk of deadlock since a hung thread might be holding the
struct_mutex already.
Therefore it's not obvious how we're supposed to deal with this. For now, we're
doing an early exit from this function, which avoids any kernel panic situation
when running our own internal TDR ULT.
Signed-off-by: Tomas Elf <tomas.elf@xxxxxxxxx>
We should have a Testcase: line here which mentions the igt testcase which
provoke this bug. Or we need to fill this gap asap.
-Daniel
You know this better than I do: Is there an IGT test that submits a
semaphore in execlist mode? Because that's all you need to do to
reproduce this. We could certainly add one if there is none like that
already.
Thanks,
Tomas
---
drivers/gpu/drm/i915/i915_irq.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 46bcbff..40c44fc 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2698,6 +2698,26 @@ semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno)
u64 offset = 0;
int i, backwards;
+ /*
+ * This function does not support execlist mode - any attempt to
+ * proceed further into this function will result in a kernel panic
+ * when dereferencing ring->buffer, which is not set up in execlist
+ * mode.
+ *
+ * The correct way of doing it would be to derive the currently
+ * executing ring buffer from the current context, which is derived
+ * from the currently running request. Unfortunately, to get the
+ * current request we would have to grab the struct_mutex before doing
+ * anything else, which would be ill-advised since some other thread
+ * might have grabbed it already and managed to hang itself, causing
+ * the hang checker to deadlock.
+ *
+ * Therefore, this function does not support execlist mode in its
+ * current form. Just return NULL and move on.
+ */
+ if (i915.enable_execlists)
+ return NULL;
+
ipehr = I915_READ(RING_IPEHR(ring->mmio_base));
if (!ipehr_is_semaphore_wait(ring->dev, ipehr))
return NULL;
--
1.7.9.5
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx