On 09.06.21 at 15:42, Daniel Vetter wrote:
[SNIP]
That won't work. The problem is that you have only one exclusive slot, but
multiple submissions which execute out of order and compose the buffer
object together.
That's why I suggested to use the dma_fence_chain to circumvent this.
But if you are ok with amdgpu setting the exclusive fence without changing the
shared ones, then the solution I've outlined should already work as well.
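
To illustrate the single-slot problem and the chaining idea described above,
here is a toy model in Python. It is only a sketch: the Fence, ChainFence and
Resv classes are hypothetical stand-ins, not the kernel's dma_fence,
dma_fence_chain or dma_resv API, and the assumption that a chain node counts
as signaled only once its own fence and everything before it have signaled is
a simplification.

```python
# Toy model only: these classes are hypothetical and are NOT the kernel API.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Fence:
    name: str
    signaled: bool = False


@dataclass
class ChainFence:
    """Links a new fence to whatever occupied the slot before it."""
    fence: Fence
    prev: Optional[object] = None  # Fence, ChainFence, or None

    @property
    def signaled(self) -> bool:
        # Done only when this fence and every earlier link are done.
        prev_done = self.prev is None or self.prev.signaled
        return self.fence.signaled and prev_done


@dataclass
class Resv:
    """A reservation object reduced to its single exclusive slot."""
    exclusive: Optional[object] = None

    def set_exclusive(self, fence):
        # Plain replacement: the previous exclusive fence is forgotten.
        self.exclusive = fence

    def chain_exclusive(self, fence):
        # Chaining: the previous occupant stays reachable through prev.
        self.exclusive = ChainFence(fence, self.exclusive)


gfx = Fence("3D")
dma = Fence("DMA")

plain = Resv()
plain.set_exclusive(gfx)
plain.set_exclusive(dma)      # overwrites the 3D fence

chained = Resv()
chained.chain_exclusive(gfx)
chained.chain_exclusive(dma)  # keeps the 3D fence in the chain

dma.signaled = True           # DMA finishes first, 3D is still running

plain_ready = plain.exclusive.signaled      # waiter would miss the 3D job
chained_ready = chained.exclusive.signaled  # waiter still blocks on 3D
```

With plain replacement a waiter sees the slot as ready as soon as the DMA job
finishes, even though the 3D job that also writes the buffer is still running.
With the chain the slot only reads as ready once both have completed.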
Uh that's indeed nasty. Can you give me the details of the exact use-case
so I can read the userspace code and come up with an idea? I was assuming
that even with parallel processing there's at least one step at the end
that unifies it for the next process.
Unfortunately not, with Vulkan that is really in the hands of the
application.
But the example we have in the test cases is using 3D+DMA to compose a
buffer IIRC.
If we can't detect this somehow then it means we do indeed have to create
a fence_chain for the exclusive slot for everything, which would be nasty.
I've already created a prototype of that and it is not that bad. It does
have some noticeable overhead, but I think that's ok.
Or a large-scale redo across all drivers, which is probably even more
nasty.
Yeah, that is indeed harder to get right.
Christian.
-Daniel