On Mon, Jun 08, 2015 at 06:03:19PM +0100, Tomas Elf wrote: > When submitting semaphores in execlist mode the hang checker crashes in this > function because it is only runnable in ring submission mode. The reason this > is of particular interest to the TDR patch series is because we use semaphores > as a mean to induce hangs during testing (which is the recommended way to > induce hangs for gen8+). It's not clear how this is supposed to work in > execlist mode since: > > 1. This function requires a ring buffer. > > 2. Retrieving a ring buffer in execlist mode requires us to retrieve the > corresponding context, which we get from a request. > > 3. Retieving a request from the hang checker is not straight-forward since that > requires us to grab the struct_mutex in order to synchronize against the > request retirement thread. > > 4. Grabbing the struct_mutex from the hang checker is nothing that we will do > since that puts us at risk of deadlock since a hung thread might be holding the > struct_mutex already. > > Therefore it's not obvious how we're supposed to deal with this. For now, we're > doing an early exit from this function, which avoids any kernel panic situation > when running our own internal TDR ULT. > > Signed-off-by: Tomas Elf <tomas.elf@xxxxxxxxx> We should have a Testcase: line here which mentions the igt testcase which provoke this bug. Or we need to fill this gap asap. -Daniel > --- > drivers/gpu/drm/i915/i915_irq.c | 20 ++++++++++++++++++++ > 1 file changed, 20 insertions(+) > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c > index 46bcbff..40c44fc 100644 > --- a/drivers/gpu/drm/i915/i915_irq.c > +++ b/drivers/gpu/drm/i915/i915_irq.c > @@ -2698,6 +2698,26 @@ semaphore_waits_for(struct intel_engine_cs *ring, u32 *seqno) > u64 offset = 0; > int i, backwards; > > + /* > + * This function does not support execlist mode - any attempt to > + * proceed further into this function will result in a kernel panic > + * when dereferencing ring->buffer, which is not set up in execlist > + * mode. > + * > + * The correct way of doing it would be to derive the currently > + * executing ring buffer from the current context, which is derived > + * from the currently running request. Unfortunately, to get the > + * current request we would have to grab the struct_mutex before doing > + * anything else, which would be ill-advised since some other thread > + * might have grabbed it already and managed to hang itself, causing > + * the hang checker to deadlock. > + * > + * Therefore, this function does not support execlist mode in its > + * current form. Just return NULL and move on. > + */ > + if (i915.enable_execlists) > + return NULL; > + > ipehr = I915_READ(RING_IPEHR(ring->mmio_base)); > if (!ipehr_is_semaphore_wait(ring->dev, ipehr)) > return NULL; > -- > 1.7.9.5 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@xxxxxxxxxxxxxxxxxxxxx > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx