On Mon, Jul 10, 2017 at 02:09:42PM -0700, Jason Ekstrand wrote:
> On Mon, Jul 10, 2017 at 9:15 AM, Christian König <deathsimple@xxxxxxxxxxx>
> wrote:
>
> > Am 10.07.2017 um 17:52 schrieb Jason Ekstrand:
> >
> > On Mon, Jul 10, 2017 at 8:45 AM, Christian König <deathsimple@xxxxxxxxxxx>
> > wrote:
> >
> >> Am 10.07.2017 um 17:28 schrieb Jason Ekstrand:
> >>
> >> On Wed, Jul 5, 2017 at 6:04 PM, Dave Airlie <airlied@xxxxxxxxx> wrote:
> >> [SNIP]
> >> So, reading some CTS tests again, and I think we have a problem here.
> >> The Vulkan spec allows you to wait on a fence that is in the unsignaled
> >> state.
> >>
> >>
> >> At least on the closed source driver that would be illegal as far as I
> >> know.
> >>
> >
> > Then they are doing workarounds in userspace.  There are definitely CTS
> > tests for this:
> >
> > https://github.com/KhronosGroup/VK-GL-CTS/blob/master/external/vulkancts/modules/vulkan/synchronization/vktSynchronizationBasicFenceTests.cpp#L74
> >
> >
> >> You can't wait on a semaphore before the signal operation is sent down to
> >> the kernel.
> >>
> >
> > We (Intel) deal with this today by tracking whether or not the fence has
> > been submitted and using a condition variable in userspace to sort it all
> > out.
> >
> >
> > Which sounds exactly like what AMD is doing in its drivers as well.
> >
>
> Which doesn't work cross-process so...
>
> > If we ever want to share fences across processes (which we do), then this
> > needs to be sorted in the kernel.
> >
> >
> > That would clearly get a NAK from my side, even Microsoft forbids wait
> > before signal because you can easily end up in deadlock situations.
> >
>
> Please don't NAK things that are required by the API specification and CTS
> tests.  That makes it very hard for people like me to get their jobs done.
> :-)
>
> Now, as for whether or not it's a good idea.  First off, we do have
> timeouts and a status querying mechanism so an application can just set a
> timeout of 1s and do something if it times out.  Second, if the application
> is a compositor or something else that doesn't trust its client, it
> shouldn't be using the OPAQUE_FD mechanism of Vulkan semaphore/fence
> sharing anyway.  For those scenarios, they can require the untrusted client
> to use FENCE_FD (sync file) and they have all of the usual guarantees about
> when the work got submitted, etc.
>
> Also, I'm more than happy to put this all behind a flag so it's not the
> default behavior.

Android had a similar requirement to have a fence fd before the fence
existed in hwc1, before they fixed that in hwc2. But it's probably still
useful for deeply pipelined renderers with little memory, aka tiled
renderers on phones.

The idea we've tossed around is to create a so-called future fence. In
the kernel, if you try to deref a future fence, the usual thing that
happens is you'll block (interruptibly, which we can do because fence
lookup might fail anyway), _until_ a real fence shows up and can be
returned. That implements the uapi expectations without risking deadlocks
in the kernel, albeit with a bit too much blocking. Still better than
doing the same in userspace (since in userspace you probably need to do
that when importing the fence, not at execbuf time).

2nd step would then be to give drivers with robust hang recovery logic a
special interface to be able to instantiate hw waits on such a future
fence before the signalling part is queued up. As long as all waiters
have robust hang recovery we still don't have a problem. Everyone else
(e.g. drm display-only drivers, v4l nodes, camera ip, whatever else
participates in the shared buffers and fences stuff) would still block
until the real fence shows up.

Similar idea should work for semaphores too.

Gustavo did look into the future fence stuff iirc, I think there was even
an rfc some time ago.
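
To make the blocking lookup a bit more concrete, here is a minimal
userspace sketch in Python, not kernel code. Everything in it is an
assumption made up for illustration: FutureFence, Fence, resolve() and
deref() are hypothetical names and don't match any existing kernel or
driver interface, and the timeout only stands in for the interruptible
wait.

```python
import threading


class Fence:
    """Stand-in for a real fence: signaled once the work completes."""

    def __init__(self):
        self._done = threading.Event()

    def signal(self):
        self._done.set()

    def wait(self, timeout=None):
        # Returns True if the fence signaled, False on timeout.
        return self._done.wait(timeout)


class FutureFence:
    """Hypothetical placeholder handed out before the real fence exists."""

    def __init__(self):
        self._cond = threading.Condition()
        self._real = None

    def resolve(self, fence):
        """Attach the real fence and wake up every blocked lookup."""
        with self._cond:
            if self._real is not None:
                raise RuntimeError("future fence already resolved")
            self._real = fence
            self._cond.notify_all()

    def deref(self, timeout=None):
        """Block until the real fence shows up; return None on timeout.

        The lookup is allowed to fail, so a waiter that gives up
        (timeout here, a signal in the kernel case) just gets nothing.
        """
        with self._cond:
            if not self._cond.wait_for(lambda: self._real is not None, timeout):
                return None
            return self._real
```

A waiter would call deref() first and only then wait on the fence it
gets back, so the wait-before-signal case turns into a blocked (and
abortable) lookup instead of a wait on something that doesn't exist yet.
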
Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel