Re: [PATCH v4 18/38] drm/i915: Added scheduler support to __wait_request() calls

On 11/01/2016 23:14, Chris Wilson wrote:
> On Mon, Jan 11, 2016 at 06:42:47PM +0000, John.C.Harrison@xxxxxxxxx wrote:
>> From: John Harrison <John.C.Harrison@xxxxxxxxx>
>>
>> The scheduler can cause batch buffers, and hence requests, to be
>> submitted to the ring out of order and asynchronously to their
>> submission to the driver. Thus at the point of waiting for the
>> completion of a given request, it is not even guaranteed that the
>> request has actually been sent to the hardware yet. Even if it has
>> been sent, it is possible that it could be pre-empted and thus
>> 'unsent'.
>>
>> This means that it is necessary to be able to submit requests to the
>> hardware during the wait call itself. Unfortunately, while some
>> callers of __wait_request() release the mutex lock first, others do
>> not (and apparently cannot). Hence there is the ability to deadlock
>> as the wait stalls for submission but the asynchronous submission is
>> stalled for the mutex lock.
>
> That is a non sequitur. Do you mean to say that unless we take action
> inside GEM, the request will never be submitted to hardware by the
> scheduler?

Potentially. The scheduler holds on to batches inside its internal
queue for later submission. It can also pre-empt batches that have
already been sent to the hardware. Thus the wait call might be waiting
on a batch which may or may not have been submitted; even if it is
currently executing, it might get kicked out and need re-submitting.
That submission requires calling the back-end submission part of the
driver - legacy ring buffer, execlist or GuC. Those back ends all
require grabbing the mutex lock.
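
To make the lock inversion concrete, the two paths look roughly like
this (an illustrative sketch only - the helper names, the wait_queue
field and request_completed()/submit_to_hardware() are made up for the
example, not real i915 symbols):

#include <linux/mutex.h>
#include <linux/wait.h>

/* Waiter: several callers enter here still holding struct_mutex. */
static int wait_for_request(struct drm_i915_gem_request *req)
{
	/*
	 * Blocks until the request completes. If the request is still
	 * sitting in the scheduler's queue (or has been pre-empted),
	 * completion depends on the scheduler submitting it first.
	 */
	return wait_event_interruptible(req->wait_queue,
					request_completed(req));
}

/* Scheduler: asynchronous submission of a queued or pre-empted batch. */
static void scheduler_submit_request(struct drm_device *dev,
				     struct drm_i915_gem_request *req)
{
	/*
	 * All three back ends (legacy ring buffer, execlist, GuC)
	 * currently take struct_mutex to write into the ring, so this
	 * stalls for ever behind a waiter that entered with the mutex
	 * held: a classic AB-BA deadlock.
	 */
	mutex_lock(&dev->struct_mutex);
	submit_to_hardware(dev, req);
	mutex_unlock(&dev->struct_mutex);
}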

>> This change hooks the scheduler into the __wait_request() code to
>> ensure correct behaviour. That is, flush the target batch buffer
>> through to the hardware and do not deadlock waiting for something
>> that cannot currently be submitted.
>
> The dependencies are known during request construction; how could we
> generate a cyclic graph?
It is not so much a cyclic graph as a bouncing one. As noted above,
even a batch buffer which is currently executing could get pre-empted
and need to be resubmitted while someone is waiting on it.
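
Roughly, the change gives the wait path a way to flush the target
request through first, something like the sketch below
(i915_scheduler_flush_request() and wait_for_completion_of() stand in
for the real helpers here; take the names and signature as
assumptions, not the actual patch contents):

static int example_wait_request(struct drm_i915_gem_request *req,
				bool locked)
{
	int ret;

	/*
	 * Push the target request (and anything it depends on) out to
	 * the hardware before blocking on its completion. 'locked'
	 * tells the scheduler whether the caller already holds
	 * struct_mutex, so the flush can avoid taking it again.
	 */
	ret = i915_scheduler_flush_request(req, locked);
	if (ret < 0)
		return ret;

	/* The normal completion wait follows as before. */
	return wait_for_completion_of(req);
}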

> The scheduler itself does not need the
> struct_mutex (other than the buggy code),
The scheduler internally does not need the mutex lock at all - it has
its own private spinlock for critical data. However, the back-end
submission paths do currently require the mutex lock.
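
So the locking split looks roughly like this (again just a sketch with
made-up structure and helper names, not the real code):

#include <linux/list.h>
#include <linux/spinlock.h>

/* The scheduler's own bookkeeping only needs its private spinlock. */
struct i915_scheduler {
	spinlock_t lock;		/* guards the two queues below */
	struct list_head queued;	/* batches awaiting submission */
	struct list_head flying;	/* batches sent to the hardware */
};

static void scheduler_submit_next(struct drm_device *dev,
				  struct i915_scheduler *sched)
{
	struct drm_i915_gem_request *req;

	spin_lock_irq(&sched->lock);
	req = pick_next_request(sched);	/* scheduler state: no mutex */
	spin_unlock_irq(&sched->lock);

	/*
	 * Handing the batch to the back end (legacy ring buffer,
	 * execlist or GuC) is where struct_mutex still comes in.
	 */
	mutex_lock(&dev->struct_mutex);
	backend_submit(dev, req);
	mutex_unlock(&dev->struct_mutex);
}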

> so GEM holding the
> struct_mutex will not prevent the scheduler from eventually submitting
> the request we are waiting for. So as far as I can see, you are
> papering over your own bugs.
> -Chris

