On Thu, Jul 02, 2015 at 02:01:56PM +0100, John Harrison wrote:
> On 02/07/2015 12:54, Chris Wilson wrote:
> >On Thu, Jul 02, 2015 at 12:09:59PM +0100, John.C.Harrison@xxxxxxxxx wrote:
> >>From: John Harrison <John.C.Harrison@xxxxxxxxx>
> >>
> >>Various projects desire a mechanism for managing dependencies between
> >>work items asynchronously. This can also include work items across
> >>completely different and independent systems. For example, an
> >>application wants to retrieve a frame from a video in device,
> >>use it for rendering on a GPU, then send it to the video out device
> >>for display, all without having to stall waiting for completion along
> >>the way. The sync framework allows this. It encapsulates
> >>synchronisation events in file descriptors. The application can
> >>request a sync point for the completion of each piece of work. Drivers
> >>should also take sync points in with each new work request and not
> >>schedule the work to start until the sync has been signalled.
> >>
> >>This patch adds sync framework support to the exec buffer IOCTL. A
> >>sync point can be passed in to stall execution of the batch buffer
> >>until signalled. And a sync point can be returned after each batch
> >>buffer submission which will be signalled upon that batch buffer's
> >>completion.
> >>
> >>At present, the input sync point is simply waited on synchronously
> >>inside the exec buffer IOCTL call. Once the GPU scheduler arrives,
> >>this will be handled asynchronously inside the scheduler and the IOCTL
> >>can return without having to wait.
> >>
> >>Note also that the scheduler will re-order the execution of batch
> >>buffers, e.g. because a batch buffer is stalled on a sync point and
> >>cannot be submitted yet but other, independent, batch buffers are
> >>being presented to the driver. This means that the timeline within the
> >>sync points returned cannot be global to the engine.
> >>Instead they must
> >>be kept per context per engine (the scheduler may not re-order batches
> >>within a context). Hence the timeline cannot be based on the existing
> >>seqno values but must be a new implementation.
> >But there is nothing preventing assignment of the sync value on
> >submission. Other than the debug .fence_value_str it's a private
> >implementation detail, and the interface is solely through the fd and
> >signalling.
> No, it needs to be public from the moment of creation. The sync
> framework API allows sync points to be combined together to create
> fences that either merge multiple points on the same timeline or
> amalgamate points across differing timelines. The merging part means
> that the sync point must be capable of doing arithmetic comparisons
> with other sync points from the instant it is returned to user land.
> And those comparisons must not change in the future due to scheduler
> re-ordering because by then it is too late to redo the test.

You know that's not documented at all. The only information userspace
gets is afaict

	struct sync_pt_info {
		__u32 len;
		char obj_name[32];
		char driver_name[32];
		__s32 status;
		__u64 timestamp_ns;
		__u8 driver_data[0];
	};

There is a merge operation done by combining two fences into a new one.
Merging is done by ordering the fences based on the context pointers and
then by sync_pt->fence.seqno, not the private sync value.

How does userspace try to order the fences other than as opaque fd?

You actually mean driver_data is undefined ABI...

> > You could implement this as a secondary write to the HWS,
> >assigning the sync_value to the sync_pt on submission and
> >remove the request tracking, as when signalled you only need to compare
> >the sync_value against the timeline value in the HWS.
> >
> >However, that equally applies to the existing request->seqno. That can
> >also be assigned on submission so that it is always an ordered timeline,
> >and so can be used internally or externally.
> >
> One of the scheduler patches is to defer seqno assignment until
> batch submission rather than do it at request creation (for
> execbuffer requests). You still have a problem with pre-emption
> though. A request that is pre-empted will get a new seqno assigned
> when it is resubmitted so that the HWS page always sees ordered
> values popping out. For internal requests, this is fine but for
> external sync points that breaks the assumptions made by the
> framework.

I fail to see how. Nothing in uapi/sync.h says anything about the order
of fences or gives any such guarantees. If the external callers only
have access through the fd, there is no restriction that the timeline
sync_pt->value must be set prior to submission.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
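
For readers following the merge argument above, here is a minimal
standalone sketch of a merge that orders points by context and then
by seqno, without looking at any driver-private sync value. It is a
userspace model only, not the kernel code: struct model_pt,
model_pt_later() and model_merge() are hypothetical names, and the
wrap-safe seqno comparison and the "inputs are sorted by context with
one point per context" precondition are assumptions made for the
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two fields the merge looks at. */
struct model_pt {
	uint32_t context;	/* identifies the timeline */
	uint32_t seqno;		/* position on that timeline */
};

/*
 * Nonzero if a is at or after b on the same timeline. The unsigned
 * difference keeps the answer stable across seqno wrap-around
 * (assumed here; valid while the two values are < 2^31 apart).
 */
static int model_pt_later(const struct model_pt *a, const struct model_pt *b)
{
	assert(a->context == b->context);
	return (uint32_t)(a->seqno - b->seqno) <= (uint32_t)INT32_MAX;
}

/*
 * Merge two arrays that are each sorted by context with at most one
 * point per context. Points on distinct timelines are all kept; for a
 * shared timeline only the later point survives. out must have room
 * for na + nb entries. Returns the number of entries written.
 */
static size_t model_merge(const struct model_pt *a, size_t na,
			  const struct model_pt *b, size_t nb,
			  struct model_pt *out)
{
	size_t i = 0, j = 0, n = 0;

	while (i < na && j < nb) {
		if (a[i].context < b[j].context) {
			out[n++] = a[i++];
		} else if (a[i].context > b[j].context) {
			out[n++] = b[j++];
		} else {
			/* Same timeline: keep whichever point is later. */
			out[n++] = model_pt_later(&a[i], &b[j]) ? a[i] : b[j];
			i++;
			j++;
		}
	}
	while (i < na)
		out[n++] = a[i++];
	while (j < nb)
		out[n++] = b[j++];
	return n;
}
```

In this model the only inputs to the merge are context and seqno, so
whatever a driver keeps as its private timeline value plays no part in
the result.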