On 02/07/2015 12:54, Chris Wilson wrote:
> On Thu, Jul 02, 2015 at 12:09:59PM +0100, John.C.Harrison@xxxxxxxxx wrote:
>> From: John Harrison <John.C.Harrison@xxxxxxxxx>
>>
>> Various projects desire a mechanism for managing dependencies between
>> work items asynchronously. This can also include work items across
>> completely different and independent systems. For example, an
>> application wants to retrieve a frame from a video-in device, use it
>> for rendering on a GPU, then send it to the video-out device for
>> display, all without having to stall waiting for completion along
>> the way. The sync framework allows this. It encapsulates
>> synchronisation events in file descriptors. The application can
>> request a sync point for the completion of each piece of work.
>> Drivers should also take sync points in with each new work request
>> and not schedule the work to start until the sync has been signalled.
>>
>> This patch adds sync framework support to the exec buffer IOCTL. A
>> sync point can be passed in to stall execution of the batch buffer
>> until signalled. And a sync point can be returned after each batch
>> buffer submission which will be signalled upon that batch buffer's
>> completion.
>>
>> At present, the input sync point is simply waited on synchronously
>> inside the exec buffer IOCTL call. Once the GPU scheduler arrives,
>> this will be handled asynchronously inside the scheduler and the
>> IOCTL can return without having to wait.
>>
>> Note also that the scheduler will re-order the execution of batch
>> buffers, e.g. because a batch buffer is stalled on a sync point and
>> cannot be submitted yet but other, independent, batch buffers are
>> being presented to the driver. This means that the timeline within
>> the sync points returned cannot be global to the engine. Instead
>> they must be kept per context per engine (the scheduler may not
>> re-order batches within a context). Hence the timeline cannot be
>> based on the existing seqno values but must be a new implementation.
>
> But there is nothing preventing assignment of the sync value on
> submission. Other than the debug .fence_value_str it's a private
> implementation detail, and the interface is solely through the fd and
> signalling.

No, it needs to be public from the moment of creation. The sync
framework API allows sync points to be combined together to create
fences that either merge multiple points on the same timeline or
amalgamate points across differing timelines. The merging part means
that the sync point must be capable of arithmetic comparison with
other sync points from the instant it is returned to userland. And
those comparisons must not change in the future due to scheduler
re-ordering, because by then it is too late to redo the test.
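To make the merging point concrete, here is a minimal sketch (names
are illustrative, not the actual i915 or sync framework code) of the
kind of comparison the framework performs: each sync point carries a
timeline value fixed at creation, and merging two points on the same
timeline keeps the later one, using a wrap-safe signed comparison.
Because that comparison can be made the instant the fd reaches
userland, the value can never be reassigned afterwards:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch only: a sync point pins its timeline position
 * at creation time. */
struct sync_pt_sketch {
	uint32_t value;		/* timeline position, fixed at creation */
};

/* Wrap-safe "a is after b" test, same idiom as seqno comparison. */
static int pt_after(const struct sync_pt_sketch *a,
		    const struct sync_pt_sketch *b)
{
	return (int32_t)(a->value - b->value) > 0;
}

/* Merging two points on one timeline: the later point dominates.
 * This answer must stay valid forever, so scheduler re-ordering
 * cannot be allowed to change either value after the fact. */
static struct sync_pt_sketch pt_merge(struct sync_pt_sketch a,
				      struct sync_pt_sketch b)
{
	return pt_after(&a, &b) ? a : b;
}
```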
> You could implement this as a secondary write to the HWS, assigning
> the sync_value to the sync_pt on submission, and remove the request
> tracking, as when signalled you only need to compare the sync_value
> against the timeline value in the HWS.
>
> However, that equally applies to the existing request->seqno. That
> can also be assigned on submission so that it is always an ordered
> timeline, and so can be used internally or externally.

One of the scheduler patches is to defer seqno assignment until batch
submission rather than doing it at request creation (for execbuffer
requests). You still have a problem with pre-emption though. A request
that is pre-empted will get a new seqno assigned when it is
resubmitted, so that the HWS page always sees ordered values popping
out. For internal requests this is fine, but for external sync points
it breaks the assumptions made by the framework.
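To illustrate the breakage (purely hypothetical numbers, not driver
code): suppose point B was created after point A, so userspace already
observed "B is after A" when merging fences. If A is then pre-empted
and resubmitted with a fresh, higher seqno, the same comparison now
gives the opposite answer, and the result userspace cached at merge
time has silently become wrong:

```c
#include <assert.h>
#include <stdint.h>

/* Wrap-safe "a is after b" comparison on 32-bit seqno values, as used
 * for ordering checks; the scenario around it is illustrative only. */
static int seqno_after(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) > 0;
}
```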
> It's a pity that the sync_pt didn't forward the
> fence->enable_signaling().
>
> As for using rsvd2, make sure you do
>
>     int fence_fd = lower_32_bits(args->rsvd2);
>
> and maybe I915_EXEC_CREATE_FENCE is a clearer name.
> -Chris
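As an aside, the rsvd2 handling you describe could look something like
this (a sketch only; the `_sketch` helpers stand in for the kernel's
lower_32_bits/upper_32_bits macros, and the field layout is an
assumption, not settled uAPI). Keeping the input fd in the lower 32
bits leaves the upper half free for a returned fence fd:

```c
#include <stdint.h>

/* Stand-ins for the kernel's lower_32_bits()/upper_32_bits() macros. */
static inline uint32_t lower_32_bits_sketch(uint64_t v)
{
	return (uint32_t)v;
}

static inline uint32_t upper_32_bits_sketch(uint64_t v)
{
	return (uint32_t)(v >> 32);
}
```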
Thanks,
John.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx