On Tue, Jun 16, 2015 at 04:46:05PM +0100, Tomas Elf wrote: > On 16/06/2015 14:44, Daniel Vetter wrote: > >On Mon, Jun 08, 2015 at 06:03:19PM +0100, Tomas Elf wrote: > >>When submitting semaphores in execlist mode the hang checker crashes in this > >>function because it is only runnable in ring submission mode. The reason this > >>is of particular interest to the TDR patch series is because we use semaphores > >>as a mean to induce hangs during testing (which is the recommended way to > >>induce hangs for gen8+). It's not clear how this is supposed to work in > >>execlist mode since: > >> > >>1. This function requires a ring buffer. > >> > >>2. Retrieving a ring buffer in execlist mode requires us to retrieve the > >>corresponding context, which we get from a request. > >> > >>3. Retieving a request from the hang checker is not straight-forward since that > >>requires us to grab the struct_mutex in order to synchronize against the > >>request retirement thread. > >> > >>4. Grabbing the struct_mutex from the hang checker is nothing that we will do > >>since that puts us at risk of deadlock since a hung thread might be holding the > >>struct_mutex already. > >> > >>Therefore it's not obvious how we're supposed to deal with this. For now, we're > >>doing an early exit from this function, which avoids any kernel panic situation > >>when running our own internal TDR ULT. > >> > >>Signed-off-by: Tomas Elf <tomas.elf@xxxxxxxxx> > > > >We should have a Testcase: line here which mentions the igt testcase which > >provoke this bug. Or we need to fill this gap asap. > >-Daniel > > You know this better than I do: Is there an IGT test that submits a > semaphore in execlist mode? Because that's all you need to do to reproduce > this. We could certainly add one if there is none like that already. Not that I know of, so needs to be added. But it's a security fix and should go to -fixes+cc:stable since any kind of userspace could Oops the kernel with this. -Daniel > > Thanks, > Tomas > > > > >>--- > >> drivers/gpu/drm/i915/i915_irq.c | 20 ++++++++++++++++++++ > >> 1 file changed, 20 insertions(+) > >> > >>diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c > >>index 46bcbff..40c44fc 100644 > >>--- a/drivers/gpu/drm/i915/i915_irq.c > >>+++ b/drivers/gpu/drm/i915/i915_irq.c > >>@@ -2698,6 +2698,26 @@ semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno) > >> u64 offset = 0; > >> int i, backwards; > >> > >>+ /* > >>+ * This function does not support execlist mode - any attempt to > >>+ * proceed further into this function will result in a kernel panic > >>+ * when dereferencing ring->buffer, which is not set up in execlist > >>+ * mode. > >>+ * > >>+ * The correct way of doing it would be to derive the currently > >>+ * executing ring buffer from the current context, which is derived > >>+ * from the currently running request. Unfortunately, to get the > >>+ * current request we would have to grab the struct_mutex before doing > >>+ * anything else, which would be ill-advised since some other thread > >>+ * might have grabbed it already and managed to hang itself, causing > >>+ * the hang checker to deadlock. > >>+ * > >>+ * Therefore, this function does not support execlist mode in its > >>+ * current form. Just return NULL and move on. > >>+ */ > >>+ if (i915.enable_execlists) > >>+ return NULL; > >>+ > >> ipehr = I915_READ(RING_IPEHR(ring->mmio_base)); > >> if (!ipehr_is_semaphore_wait(ring->dev, ipehr)) > >> return NULL; > >>-- > >>1.7.9.5 > >> > >>_______________________________________________ > >>Intel-gfx mailing list > >>Intel-gfx@xxxxxxxxxxxxxxxxxxxxx > >>http://lists.freedesktop.org/mailman/listinfo/intel-gfx > > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx