On 22/09/16 10:22 PM, Christian König wrote:
> On 22.09.2016 at 15:05, Daniel Vetter wrote:
>> On Thu, Sep 22, 2016 at 2:44 PM, Christian König
>> <deathsimple@xxxxxxxxxxx> wrote:
>>>> - explicit fencing: Userspace passes around distinct fence objects for
>>>> any work going on on the gpu. The kernel doesn't insert any stall of
>>>> its own (except for moving buffer objects around ofc). This is what
>>>> Android does. This also seems to be what amdgpu is doing within one
>>>> process/owner.
>>>
>>> No, that is clearly not my understanding of explicit fencing.
>>>
>>> Userspace doesn't necessarily need to pass around distinct fence objects
>>> with all of its protocols and the kernel is still responsible for inserting
>>> stalls whenever a userspace protocol or application requires this
>>> semantics.
>>>
>>> Otherwise you will never be able to use explicit fencing on the Linux
>>> desktop with protocols like DRI2/DRI3.
>> This is about mixing them. Explicit fencing still means userspace has
>> an explicit piece, separate from buffers, (either sync_file fd, or a
>> driver-specific cookie, or similar).
>>
>>> I would expect that every driver in the system waits for all fences of a
>>> reservation object as long as it isn't told otherwise by providing a
>>> distinct fence object with the IOCTL in question.
>> Yup agreed. This way if your explicitly-fencing driver reads a shared
>> buffer passed over a protocol that does implicit fencing (like
>> DRI2/3), then it will work.
>>
>> The other interop direction is explicitly-fencing driver passes a
>> buffer to a consumer which expects implicit fencing. In that case you
>> must attach the right fence to the exclusive slot, but _only_ in that
>> case.
>
> Ok well sounds like you are close to understanding why I can't do exactly
> this: There simply is no right fence I could attach.
>
> When amdgpu makes the command submissions it doesn't necessarily know
> that the buffer will be exported and shared with another device later on.
>
> So when the buffer is exported and given to the other device you might
> have a whole bunch of fences which run concurrently and not in any
> serial order.

I feel like you're thinking too much of buffers shared between GPUs as
being short-lived and only shared late. In the use-cases I know about,
shared buffers are created separately and shared ahead of time; the
actual rendering work is done to non-shared buffers and then just copied
to the shared buffers for transfer between GPUs. These copies are always
performed by the same context in such a way that they should always be
performed by the same HW engine and thus implicitly serialized.

Do you have any specific use-cases in mind where buffers are only shared
between GPUs after the rendering operations creating the buffer contents
to be shared have already been submitted?

>> Otherwise you end up stalling your explicitly-fencing userspace,
>> since implicit fencing doesn't allow more than 1 writer. For amdgpu
>> one possible way to implement this might be to count how many users a
>> dma-buf has, and if it's more than just the current context set the
>> exclusive fence. Or do a uabi revision and let userspace decide (or
>> at least overwrite it).
>
> I mean I can pick one fence and wait for the rest to finish manually,
> but that would certainly defeat the whole effort, wouldn't it?

I'm afraid it's not clear to me why it would. Can you elaborate?

>> But the current approach in amdgpu_sync.c of declaring a fence as
>> exclusive after the fact (if owners don't match) just isn't how
>> reservation_object works. You can of course change that, but that
>> means you must change all drivers implementing support for implicit
>> fencing of dma-buf. Fixing amdgpu will be easier ;-)
>
> Well as far as I can see there is no way I can fix amdgpu in this case.
>
> The handling clearly needs to be changed on the receiving side of the
> reservation objects if I don't completely want to disable concurrent
> access to BOs in amdgpu.

Anyway, we need a solution for this between radeon and amdgpu, and I
don't think it is a good idea for those two drivers to use reservation
object semantics between them that differ from those of all other
drivers.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel
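
For readers following the thread, the disagreement about the exclusive
slot can be illustrated with a toy model. This is only a sketch under
stated assumptions: every name below (toy_fence, toy_resv and the toy_*
helpers) is hypothetical and none of it is kernel API. It assumes the
semantics discussed above, where an implicitly-fencing reader waits
only for the fence in the exclusive slot, while a writer waits for the
exclusive fence and all shared fences.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_SHARED 8

/* Hypothetical stand-in for a fence: only records whether the work is done. */
struct toy_fence {
    bool signaled;
};

/* Hypothetical stand-in for a reservation object:
 * one exclusive slot plus a small array of shared fences. */
struct toy_resv {
    const struct toy_fence *exclusive;              /* NULL if the slot is empty */
    const struct toy_fence *shared[TOY_MAX_SHARED];
    size_t shared_count;
};

static inline bool toy_fence_pending(const struct toy_fence *f)
{
    return f != NULL && !f->signaled;
}

/* Add a fence to the shared list (bounded in this toy model). */
static inline void toy_resv_add_shared(struct toy_resv *r,
                                       const struct toy_fence *f)
{
    assert(r->shared_count < TOY_MAX_SHARED);
    r->shared[r->shared_count++] = f;
}

/* Simplifying assumption of this model: installing an exclusive fence
 * replaces the old one and empties the shared list. */
static inline void toy_resv_set_exclusive(struct toy_resv *r,
                                          const struct toy_fence *f)
{
    r->exclusive = f;
    r->shared_count = 0;
}

/* An implicitly-fencing reader only looks at the exclusive slot. */
static inline bool toy_reader_must_wait(const struct toy_resv *r)
{
    return toy_fence_pending(r->exclusive);
}

/* An implicitly-fencing writer waits for the exclusive and all shared fences. */
static inline bool toy_writer_must_wait(const struct toy_resv *r)
{
    if (toy_fence_pending(r->exclusive))
        return true;
    for (size_t i = 0; i < r->shared_count; i++)
        if (toy_fence_pending(r->shared[i]))
            return true;
    return false;
}
```

In this model, if two concurrent submissions are tracked only as shared
fences, toy_reader_must_wait() reports that a reader on the importing
side has nothing to wait for even though both are still pending; that
is the interop case where a fence would have to be placed in the
exclusive slot. Once a single fence sits in the exclusive slot, the
reader waits until it signals.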