On Wed, Apr 20, 2016 at 06:13:18PM +0100, John.C.Harrison@xxxxxxxxx wrote:
> From: John Harrison <John.C.Harrison@xxxxxxxxx>
>
> Implemented a batch buffer submission scheduler for the i915 DRM driver.
>
> The general theory of operation is that when batch buffers are
> submitted to the driver, the execbuffer() code assigns a unique seqno
> value and then packages up all the information required to execute the
> batch buffer at a later time. This package is given over to the
> scheduler which adds it to an internal node list.

That is called the request. Please use it properly. We pass a request to
the engine. The engine can hook up the scheduler, in which case it may
defer execution of that request until the scheduler says it is ready.
But that is a conversation inside the engine, not intermingled with GEM.
GEM only needs to care about its unique seqno along the timeline. We can
leave space for the scheduler to inject a second seqno if that is
convenient, but since the scheduler can keep a global list of in-flight
requests, its retirement is also globally ordered even if we only
inspect each request wrt its own timeline.

> The scheduler also
> scans the list of objects associated with the batch buffer and
> compares them against the objects already in use by other buffers in
> the node list.

This moves the policy of the GEM API into the scheduler and enshrines it
there. Let GEM convert its implicit rules into explicit dependencies,
onto which we can add the explicit dependencies passed in by the user.
GEM has the information it needs to maintain its API; let's not let that
silliness spread.

> If matches are found then the new batch buffer node is
> marked as being dependent upon the matching node. The same is done for
> the context object. The scheduler also bumps up the priority of such
> matching nodes on the grounds that the more dependencies a given batch
> buffer has the more important it is likely to be.

Sounds like a rough cut at avoiding priority inversion. This is easily
abusable. It sounds like you want a prio/nice split. Is that a good idea
to introduce in the first series?

> The scheduler aims to have a given (tuneable) number of batch buffers
> in flight on the hardware at any given time. If fewer than this are
> currently executing when a new node is queued, then the node is passed
> straight through to the submit function. Otherwise it is simply added
> to the queue and the driver returns to userland.

Keep it in the scheduler. I see you decided to return EAGAIN from GEM
execbuf, where currently execbuf *blocks* in this situation. Don't use
EAGAIN here, since that just makes userspace busy-spin over a very, very
heavyweight ioctl, and you have the power here to throttle userspace
without blocking.

> As each batch buffer completes, it raises an interrupt which wakes up
> the scheduler. Note that it is possible for multiple buffers to
> complete before the IRQ handler gets to run. Further, the seqno values
> of the individual buffers are not necessarily incrementing as the
> scheduler may have re-ordered their submission. However, the scheduler
> keeps the list of executing buffers in order of hardware submission.
> Thus it can scan through the list until a matching seqno is found and
> then mark all in-flight nodes from that point on as completed.

No. You haven't built your timelines correctly. The idea behind the
timeline is that a request can wait upon its seqno and check it against
its own timeline, which is ordered only with other requests on that
timeline. Because of this, each timeline stays ordered regardless of the
order in which the scheduler executes the timelines. A request can
simply wait for its timeline to advance, completely ignorant of the
scheduler. (Request signaling may be driven by the scheduler, but that
is a low-level implementation detail, not GEM or dma-buf/fence. And only
if the scheduler is coupled into the user-interrupt; on execlists it
will be using the context-switch interrupt to drive itself, and for
ringbuffer mode we have a choice of user-interrupt or using
pipe-control/dw-notify to keep the paths separate.)
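To illustrate the point, a rough sketch of the completion check; the
struct layout and field names here are invented for the example, they
are not taken from the patch series or the current driver:

  /* Completion is a per-timeline question: each timeline has its own
   * monotonic seqno and its own breadcrumb slot that the GPU writes,
   * so a request only ever compares against its own timeline -- no
   * global scan of the in-flight list, and no dependence on the order
   * in which the scheduler ran anyone else's requests.
   */
  struct i915_timeline {
          u32 seqno;      /* last seqno emitted on this timeline */
          u32 *hwsp;      /* breadcrumb slot the GPU writes on completion */
  };

  static bool request_completed(const struct i915_request *rq)
  {
          /* (s32)(a - b) >= 0 copes with seqno wraparound */
          return (s32)(READ_ONCE(*rq->timeline->hwsp) - rq->seqno) >= 0;
  }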
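The same shape falls out at submission, which anticipates the
execbuff_final() point below. Another sketch, again with invented names
rather than the current driver API:

  /* The request packages everything needed for execution, so nothing
   * is ever passed back up the layers: the engine either submits
   * directly or lets its scheduler decide when to do so.
   */
  struct intel_engine_cs {
          struct i915_scheduler *sched;   /* NULL when unscheduled */
          void (*submit_request)(struct i915_request *rq);
  };

  static void engine_add_request(struct i915_request *rq)
  {
          if (rq->engine->sched)
                  i915_scheduler_queue(rq);        /* defer until ready */
          else
                  rq->engine->submit_request(rq);  /* straight to hw */
  }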
> A deferred work queue is also poked by the interrupt handler.

Wait, you did all that in the interrupt handler? NO.

> If a suitable node is found then it is sent to execbuff_final() for
> submission to the hardware.

That is an absolutely atrocious implementation. Don't pass it back up
the layers when you already have all the information you need packaged
in the request to submit it to hardware.

execbuf -> engine->add_request -> [scheduler] -> engine->submit_request

(Conceptually, since what should happen is that execlists gets its
context-switch interrupt and then asks what set of requests to submit
next. In ringbuffer mode, a kthread/ktasklet would run off the
interrupt, ask which request to execute next, do the write into the
ring for the switch into the new context and the execution of the next
batch, and then write the RING_TAIL.)

At the fundamental level it looks like you have neither introduced
timelines correctly nor introduced the scheduler as a separate entity
for deciding which request to execute next (or whether this request
should preempt execution).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx