On Fri, Jan 23, 2015 at 6:30 PM, Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> wrote:
> On Fri, Jan 23, 2015 at 04:53:48PM +0100, Daniel Vetter wrote:
>> Yeah that's kind of the big behaviour difference (at least as I see it)
>> between explicit sync and implicit sync:
>> - with implicit sync the kernel attaches sync points/requests to buffers
>>   and userspace just asks about the idle/busy state of buffers.
>>   Synchronization between different users is all handled behind
>>   userspace's back in the kernel.
>>
>> - explicit sync attaches sync points to individual bits of work and makes
>>   them explicit objects userspace can get at and pass around. Userspace
>>   uses these separate things to inquire about when something is
>>   done/idle/busy and has its own mapping between explicit sync objects and
>>   the different pieces of memory affected by each. Synchronization between
>>   different clients is handled explicitly by passing sync objects around
>>   each time some rendering is done.
>>
>> The bigger driver for explicit sync (besides "nvidia likes it sooooo much
>> that everyone uses it a lot") seems to be a) shitty gpu drivers without
>> proper bo managers (*cough*android*cough*) and svm, where there's simply
>> no buffer objects any more to attach sync information to.
>
> Actually, mesa would really like much finer granularity than at batch
> boundaries. Having a sync object for a batch boundary itself is very meh
> and not a substantive improvement on what is possible today, but being
> able to convert the implicit sync into an explicit fence object is
> interesting and lends a layer of abstraction that could make it more
> versatile. Most importantly, it allows me to defer the overhead of fence
> creation until I actually want to sleep on a completion. Also Jesse
> originally supported inserting fences inside a batch, which looked
> interesting if impractical.

If we want to allow the kernel to stall on fences (e.g. in the scheduler),
only the kernel should be allowed to create fences, imo. At least current
fences assume that they _will_ signal eventually, and for i915 fences we
have the hangcheck to ensure this is the case. In-batch fences and lazy
fence creation (beyond just delaying the fd allocation to avoid too many
fds flying around) are therefore a no-go.

For that kind of fine-grained sync between gpu and cpu workloads, the
solution thus far (at least from what I've seen) is just busy-looping.
Usually those workloads have a few orders of magnitude more sync points
than the frames we tend to render, so blocking isn't terribly efficient
anyway.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
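
To make the busy-looping remark above concrete, here is a rough sketch of what polling a fine-grained sync point from the CPU could look like. Everything in it is an assumption for illustration: struct seqno_fence, fence_signaled() and fence_spin() are hypothetical names, not i915 or kernel API, and the "hardware" sequence number is just an atomic variable that something else is presumed to advance. The spin is bounded because, unlike kernel-created fences backed by hangcheck, nothing here guarantees the sync point ever signals.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a sequence number that the GPU writes to
 * memory as work completes. None of these names come from i915. */
struct seqno_fence {
    _Atomic uint32_t *hw_seqno; /* last completed sequence number */
    uint32_t target;            /* sequence number that signals this fence */
};

/* Wraparound-safe "a has reached or passed b": the signed difference
 * is non-negative even after the 32-bit counter wraps. */
bool seqno_passed(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) >= 0;
}

/* Non-blocking query: has the sync point been reached yet? */
bool fence_signaled(const struct seqno_fence *f)
{
    assert(f && f->hw_seqno);
    uint32_t now = atomic_load_explicit(f->hw_seqno, memory_order_acquire);
    return seqno_passed(now, f->target);
}

/* Busy-loop on the fence for at most max_spins polls. Returns true if
 * it signaled in time; the caller decides what to do on false (keep
 * spinning, or fall back to a blocking wait). */
bool fence_spin(const struct seqno_fence *f, unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; i++) {
        if (fence_signaled(f))
            return true;
    }
    return false;
}
```

A caller with many sync points per frame would poll each one with fence_spin() and a small budget; since each check is a single load and compare, it costs far less than sleeping and waking per sync point, which is the trade-off described above.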