Re: [RFC 01/11] drm/i915: Early exit from semaphore_waits_for for execlist mode.

On 16/06/2015 17:50, Chris Wilson wrote:
On Tue, Jun 16, 2015 at 04:46:05PM +0100, Tomas Elf wrote:
On 16/06/2015 14:44, Daniel Vetter wrote:
On Mon, Jun 08, 2015 at 06:03:19PM +0100, Tomas Elf wrote:
When submitting semaphores in execlist mode the hang checker crashes in this
function because it is only runnable in ring submission mode. The reason this
is of particular interest to the TDR patch series is because we use semaphores
as a means to induce hangs during testing (which is the recommended way to
induce hangs for gen8+). It's not clear how this is supposed to work in
execlist mode since:

1. This function requires a ring buffer.

2. Retrieving a ring buffer in execlist mode requires us to retrieve the
corresponding context, which we get from a request.

3. Retrieving a request from the hang checker is not straightforward since that
requires us to grab the struct_mutex in order to synchronize against the
request retirement thread.

4. Grabbing the struct_mutex from the hang checker is not something we will do,
because it puts us at risk of deadlock: a hung thread might already be holding
the struct_mutex.

Therefore it's not obvious how we're supposed to deal with this. For now, we're
doing an early exit from this function, which avoids any kernel panic situation
when running our own internal TDR ULT.

Signed-off-by: Tomas Elf <tomas.elf@xxxxxxxxx>

We should have a Testcase: line here which mentions the igt testcase which
provokes this bug. Or we need to fill this gap asap.
-Daniel

You know this better than I do: Is there an IGT test that submits a
semaphore in execlist mode? Because that's all you need to do to
reproduce this. We could certainly add one if there is none like
that already.

No, we don't have anything submitting a hanging semaphore from
userspace or igt specifically.
-Chris


At first I thought it would be OK to just submit any semaphore, but I
guess it would have to be a hanging semaphore specifically, or at least
a semaphore that does not progress ACTHD from one hang check period to
the next. We check for ACTHD progression in ring_stuck() and, if ACTHD
hasn't moved, call semaphore_passed(), which in turn calls
semaphore_waits_for().

Fine, we'll have to add that then.
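
To make the call chain above concrete, here is a rough, self-contained
sketch of the sequence as a userspace toy model. It is not the i915 code:
every toy_* name, the struct layout, the enum values and the return codes
are assumptions made for illustration only. It shows why the execlist early
exit is only reached when ACTHD has not moved between two hang checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins. NOT the real i915 structures. */
enum toy_submit_mode { TOY_RING, TOY_EXECLIST };
enum toy_hang_action { TOY_ACTIVE, TOY_KICK, TOY_WAIT, TOY_HUNG };

struct toy_engine {
	enum toy_submit_mode mode;
	uint64_t acthd;               /* head pointer sampled now */
	uint64_t last_acthd;          /* head pointer from the previous check */
	bool waiting_on_semaphore;
	struct toy_engine *signaller; /* engine expected to signal, or NULL */
	uint32_t seqno;               /* last seqno this engine completed */
	uint32_t wait_seqno;          /* seqno the semaphore is waiting for */
};

/* Toy semaphore_waits_for(): find the engine we are waiting on.
 * In execlist mode we bail out early, as the patch proposes, because
 * this model has no safe way to reach a ring buffer from here. */
struct toy_engine *toy_semaphore_waits_for(const struct toy_engine *e)
{
	if (e->mode == TOY_EXECLIST)
		return NULL;
	if (!e->waiting_on_semaphore)
		return NULL;
	return e->signaller;
}

/* Toy semaphore_passed(): 1 = signaller got far enough, 0 = still
 * waiting, -1 = cannot tell (no signaller found). */
int toy_semaphore_passed(const struct toy_engine *e)
{
	const struct toy_engine *sig = toy_semaphore_waits_for(e);

	if (sig == NULL)
		return -1;
	return sig->seqno >= e->wait_seqno ? 1 : 0;
}

/* Toy ring_stuck(): only look at the semaphore if ACTHD has not moved. */
enum toy_hang_action toy_ring_stuck(struct toy_engine *e)
{
	if (e->acthd != e->last_acthd) {
		e->last_acthd = e->acthd;
		return TOY_ACTIVE;
	}
	if (!e->waiting_on_semaphore)
		return TOY_HUNG;

	switch (toy_semaphore_passed(e)) {
	case 1:
		return TOY_KICK;
	case 0:
		return TOY_WAIT;
	default:
		return TOY_HUNG;
	}
}

/* Small usage example for the toy model. */
void toy_demo(void)
{
	struct toy_engine sig = { .mode = TOY_RING, .seqno = 5 };
	struct toy_engine e = {
		.mode = TOY_RING, .acthd = 100, .last_acthd = 100,
		.waiting_on_semaphore = true, .signaller = &sig,
		.wait_seqno = 10,
	};

	assert(toy_ring_stuck(&e) == TOY_WAIT); /* signaller still behind */
	e.mode = TOY_EXECLIST;
	assert(toy_ring_stuck(&e) == TOY_HUNG); /* early exit path taken */
}
```

In this model a semaphore that keeps ACTHD moving never reaches
toy_semaphore_waits_for() at all, which matches the point that a test
would need a semaphore that stalls ACTHD across a hang check period.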

Thanks,
Tomas
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx



