On 22.09.2016 at 15:05, Daniel Vetter wrote:
On Thu, Sep 22, 2016 at 2:44 PM, Christian König
<deathsimple@xxxxxxxxxxx> wrote:
- explicit fencing: Userspace passes around distinct fence objects for
any work going on on the GPU. The kernel doesn't insert any stalls of
its own (except for moving buffer objects around, of course). This is
what Android does. This also seems to be what amdgpu is doing within
one process/owner.
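For illustration, the purely explicit model described above can be sketched as a small userspace-level Python model (all names here are hypothetical, not the actual kernel API): ordering exists only where userspace hands a fence back in, and the kernel never stalls a job just because of the buffers it touches.

```python
class Fence:
    """A distinct fence object userspace can pass around."""
    def __init__(self):
        self.signaled = False

    def signal(self):
        # In reality the hardware signals this on job completion.
        self.signaled = True

class Job:
    """A GPU submission. The 'kernel' only honours fences userspace
    passed in explicitly; it inserts no implicit stalls of its own."""
    def __init__(self, deps):
        self.deps = deps          # in-fences handed over by userspace
        self.out = Fence()        # distinct out-fence for this job

    def ready(self):
        return all(f.signaled for f in self.deps)

a = Job(deps=[])                  # first job, no dependencies
b = Job(deps=[a.out])             # userspace explicitly orders b after a
assert a.ready() and not b.ready()
a.out.signal()                    # job a finishes
assert b.ready()                  # only now may b run
```

Note that two jobs touching the same buffer with no fence passed between them would, in this model, run in any order; that is exactly the behaviour the implicit model avoids.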
No, that is clearly not my understanding of explicit fencing.
Userspace doesn't necessarily need to pass around distinct fence objects
with all of its protocols, and the kernel is still responsible for inserting
stalls whenever a userspace protocol or application requires these
semantics.
Otherwise you will never be able to use explicit fencing on the Linux
desktop with protocols like DRI2/DRI3.
This is about mixing them. Explicit fencing still means userspace has
an explicit piece, separate from buffers (either a sync_file fd, a
driver-specific cookie, or similar).
I would expect that every driver in the system waits for all fences of a
reservation object as long as it isn't told otherwise by providing a
distinct fence object with the IOCTL in question.
Yup agreed. This way if your explicitly-fencing driver reads a shared
buffer passed over a protocol that does implicit fencing (like
DRI2/3), then it will work.
The other interop direction is that an explicitly-fencing driver passes a
buffer to a consumer which expects implicit fencing. In that case you
must attach the right fence to the exclusive slot, but _only_ in that
case.
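The reservation-object contract being described here can be modeled in a few lines of Python (a toy stand-in for the kernel structs, not the real API): one exclusive slot for the single writer, many shared slots for readers, and different wait rules depending on whether the new job writes.

```python
class Reservation:
    """Toy model of a per-buffer reservation object: one exclusive
    (writer) fence plus any number of shared (reader) fences."""
    def __init__(self):
        self.exclusive = None     # fence of the last writer, if any
        self.shared = []          # fences of in-flight readers

    def fences_to_wait_for(self, writing):
        if writing:
            # A writer must wait for the previous writer and all readers.
            return self.shared + ([self.exclusive] if self.exclusive else [])
        # A reader only has to wait for the last writer.
        return [self.exclusive] if self.exclusive else []

    def add_read(self, fence):
        self.shared.append(fence)

    def add_write(self, fence):
        # The new exclusive fence supersedes the accumulated shared ones.
        self.exclusive = fence
        self.shared = []

resv = Reservation()
resv.add_write("w1")
assert resv.fences_to_wait_for(writing=False) == ["w1"]
resv.add_read("r1")
resv.add_read("r2")
assert sorted(resv.fences_to_wait_for(writing=True)) == ["r1", "r2", "w1"]
```

This is why the exclusive slot matters for interop: an implicitly-fencing consumer will only stall on whatever sits in that one slot.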
Ok, well, sounds like you are close to understanding why I can't do exactly
this: there simply is no right fence I could attach.
When amdgpu makes the command submissions it doesn't necessarily know
that the buffer will be exported and shared with another device later on.
So when the buffer is exported and given to the other device you might
have a whole bunch of fences which run concurrently and not in any
serial order.
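One conceivable way out of the "no single right fence" problem is to collapse the whole set of concurrent fences into one aggregate fence that only signals once all of them have (the kernel's fence_array was added for this kind of aggregation). A toy model, to make the idea concrete rather than to suggest this is what amdgpu does:

```python
class Fence:
    def __init__(self):
        self._done = False

    def signal(self):
        self._done = True

    def signaled(self):
        return self._done

class MergedFence:
    """One fence that signals once every constituent fence has.
    Something like this is what would have to go into the single
    exclusive slot when several submissions ran concurrently."""
    def __init__(self, fences):
        self.fences = fences

    def signaled(self):
        return all(f.signaled() for f in self.fences)

f1, f2 = Fence(), Fence()
merged = MergedFence([f1, f2])
assert not merged.signaled()
f1.signal()
assert not merged.signaled()   # still waiting on f2
f2.signal()
assert merged.signaled()       # all concurrent work has finished
```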
Otherwise you end up stalling your explicitly-fencing userspace,
since implicit fencing doesn't allow more than one writer. For amdgpu
one possible way to implement this might be to count how many users a
dma-buf has, and if it's more than just the current context, set the
exclusive fence. Or do a uabi revision and let userspace decide (or
at least overwrite it).
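The heuristic floated above (hypothetical, not existing amdgpu code) boils down to a one-line decision: only promote a fence to the exclusive slot when the dma-buf is actually used outside the submitting context.

```python
def choose_slot(buf_users, current_ctx):
    """Pick the reservation slot for a new fence: use the exclusive
    slot only if the dma-buf has users beyond the current context,
    otherwise keep it in the shared slots and avoid the stall."""
    other_users = [u for u in buf_users if u != current_ctx]
    return "exclusive" if other_users else "shared"

assert choose_slot({"ctxA"}, "ctxA") == "shared"            # private use
assert choose_slot({"ctxA", "ctxB"}, "ctxA") == "exclusive" # shared buffer
```

The trade-off is that any cross-context sharing then serializes on the exclusive fence, which is exactly the cost Christian objects to below.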
I mean I can pick one fence and wait for the rest to finish manually,
but that would certainly defeat the whole effort, wouldn't it?
I completely agree that you have only 1 writer with implicit fencing,
but when you switch from explicit fencing back to implicit fencing you
can have multiple ones.
But the current approach in amdgpu_sync.c of declaring a fence as
exclusive after the fact (if owners don't match) just isn't how
reservation_object works. You can of course change that, but that
means you must change all drivers implementing support for implicit
fencing of dma-buf. Fixing amdgpu will be easier ;-)
Well, as far as I can see there is no way I can fix amdgpu in this case.
The handling clearly needs to be changed on the receiving side of the
reservation objects if I don't want to completely disable concurrent
access to BOs in amdgpu.
Regards,
Christian.
-Daniel
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel