On Thu 26 Jan 2017, Chris Wilson wrote: > On Wed, Jan 25, 2017 at 12:38:32PM -0800, Chad Versace wrote: > > On Mon 14 Nov 2016, Chris Wilson wrote: > > > Userspace is faced with a dilemma. The kernel requires implicit fencing > > > to manage resource usage (we always must wait for the GPU to finish > > > before releasing its PTE) and for third parties. However, userspace may > > > wish to avoid this serialisation if it is either using explicit fencing > > > between parties and wants more fine-grained access to buffers (e.g. it > > > may partition the buffer between uses and track fences on ranges rather > > > than the implicit fences tracking the whole object). It follows that > > > userspace needs a mechanism to avoid the kernel's serialisation on its > > > implicit fences before execbuf execution. > > > > > > The next question is whether this is an object, execbuf or context flag. > > > Hybrid users (such as using explicit EGL_ANDROID_native_sync fencing on > > > shared winsys buffers, but implicit fencing on internal surfaces) > > > require a per-object level flag. Given that this flag need to be only > > > set once for the lifetime of the object, this reduces the convenience of > > > having an execbuf or context level flag (and avoids having multiple > > > pieces of uABI controlling the same feature). > > > > > > Incorrect use of this flag will result in rendering corruption and GPU > > > hangs - but will not result in use-after-free or similar resource > > > tracking issues. > > > > > > Serious caveat: write ordering is not strictly correct after setting > > > this flag on a render target on multiple engines. This affects all > > > subsequent GEM operations (execbuf, set-domain, pread) and shared > > > dma-buf operations. A fix is possible - but costly (both in terms of > > > further ABI changes and runtime overhead). > > > > > > Testcase: igt/gem_exec_async > > > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > > Reviewed-by: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx> > > > --- > > > drivers/gpu/drm/i915/i915_drv.c | 1 + > > > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 3 +++ > > > include/uapi/drm/i915_drm.h | 29 ++++++++++++++++++++++++++++- > > > 3 files changed, 32 insertions(+), 1 deletion(-) > > > > I'm neutral about this patch. I believe patch 14/14 is useful with or > > without this patch, and I want to see patch 14 land regardless of what > > happens with this one. > > I don't like the patch, it opens up a big wart in the GEM api (incorrect > write tracking on GEM/dma-buf across multiple timelines). Otoh, being > able to discard the implicit fence tracking seems to be an important > feature request - if we go forward witout it, we will then be lacking a > feature that is common across other drivers and in particular seems to > be commonplace in the Android ecosystem. I agree. The explicit fence fds provide more benefit (that is, less blocking and, in general, more *explicitness*) when implicit fencing is disabled. Userspace should have some API to disable the implicit fencing, and this patch seems like an ok approach. I certainly can think of nothing better. > Daniel, what's your feeling? One problem will be that the > synchronisation issue may be hard to track down in future (proving that > the cause of a stall is an avoidable implicit fence). > > > I'm not opposed to this patch. It's just that I don't yet understand > > exactly if Mesa's EGL/GL code could effectively use this feature for > > Android winsys buffers. The amount of information loss between the > > EGL/GL apis and the eventual execbuffer submission may prevent Mesa from > > annotating the Android winsys buffers with this. I'm unsure. I'm still > > thinking about it. > > > > But, if Chris, or anyone, already has plans to use this somehow, perhaps > > in the DDX, then don't let my hesitation block the patch. > > Actually, the example I have would be for mesa. It can use this on its > own scratch buffers to just discard writes and prevent ordering on > a single scratch shared between contexts, and for its fence tracking using > a single page for multiple rings. Those use cases sound good to me. This patch is Acked-by: Chad Versace <chadversary@xxxxxxxxxxxx> _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx