Shared semaphores for amdgpu

david1.zhou@xxxxxxx (zhoucm1) · Thu, 9 Mar 2017 17:43:02 +0800



On 2017-03-09 17:12, Christian König wrote:
> On 09.03.2017 at 09:15, Dave Airlie wrote:
>> On 9 March 2017 at 17:38, Christian König <christian.koenig at amd.com>
>> wrote:
>>>> I do wonder if we need the separate sem signal/wait interface, I think
>>>> we should just add
>>>> semaphore chunks to the CS interface.
>>> Yeah, that's what I've said as well from the very beginning.
>>>
>>> Another question is if we should really create another implementation
>>> to share semaphores between processes.
>>>
>>> In other words, putting the current fences inside the semaphore into a
>>> sync_file with the signal_on_any bit set would have pretty much the
>>> same effect, except that the resulting object would then have the
>>> sync_file semantics for adding new fences and could be used in the
>>> atomic IOCTLs as well.
>> So the Vulkan external semaphore spec has two different types of
>> semaphore semantics. I'm not sure the sync_file semantics match the
>> first type, only the second.
>
> I haven't completely read that part of the spec yet, but from what I 
> know the first semantics is actually a bit scary and I'm not sure if 
> we want to fully support that.
>
> In particular, being able to wait on a semaphore object which is not
> signaled yet can easily lead to deadlocks and bound resources in the
> kernel and windowing system.
>
> Imagine that you send a command submission to the kernel with the 
> request to wait for a semaphore object and then never signal that 
> semaphore object. At least for amdgpu the kernel driver would accept 
> that CS and push it into the scheduler. This operation needs memory, 
> so by doing this the application would bind kernel memory without the 
> prospect of releasing it anytime soon.
>
> We could of course try to limit the number of waiting CSs in the
> kernel, but then we have the problem of deadlocks again. E.g. the
> signaling CS wouldn't be accepted by the kernel because we have so
> many waiters.
>
> In addition to that, you can easily build deadlocks of the form CS A
> depends on CS B and CS B depends on CS A. The exact same problem for
> Android fences was discussed on the list as well, but the semantics
> there are specifically designed so that you can't build deadlocks with
> them.
Forbidding waits on un-signaled semaphores will be enough to address this concern.
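
As a rough illustration of the rule of rejecting waits on a semaphore
that has not been signaled, here is a minimal userspace sketch. It is
not the attached patch and not amdgpu code: every name in it
(model_fence, model_sem, model_sem_signal, model_sem_wait) is
hypothetical, and it assumes that "un-signaled" means "no signal
operation has been queued on the semaphore yet".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model, not the amdgpu implementation. */
struct model_fence {
    bool signaled;              /* completion state of the signaling work */
};

struct model_sem {
    struct model_fence *fence;  /* NULL until a signal operation attaches one */
};

/* Signal: attach the fence of the signaling submission to the semaphore. */
static int model_sem_signal(struct model_sem *sem, struct model_fence *fence)
{
    assert(sem != NULL && fence != NULL);
    if (sem->fence)
        return -EBUSY;          /* a signal is already pending */
    sem->fence = fence;
    return 0;
}

/*
 * Wait: rejected with -EINVAL when no signal operation has been queued,
 * so a submission can never be parked on a semaphore nobody will signal.
 * On success the fence is handed to the caller and consumed.
 */
static int model_sem_wait(struct model_sem *sem, struct model_fence **out)
{
    assert(sem != NULL && out != NULL);
    if (!sem->fence)
        return -EINVAL;
    *out = sem->fence;
    sem->fence = NULL;
    return 0;
}
```

With this rule a waiter always depends on work that has already been
submitted, so the A-waits-on-B, B-waits-on-A cycle cannot be built
through the semaphore alone.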

>
>> I think we would still need separate objects to do the first type,
Agreed, the implementation indeed does this.
>> which I want for VR stuff..
>
> Which is perfectly reasonable, sharing the object between processes 
> takes time. So you only want to do this once.
>
> As a possible solution what do you think about adding some new 
> functionality to the sync file IOCTLs?
>
> IIRC we currently only support adding new fences to the sync file and
> then waiting for all of them in the CS/Atomic page flip.
>
> But what if we also allow replacing the fence(s) in the sync file? And
> then, in addition to that, consuming the fence in the CS/Atomic page
> flip IOCTL?
I feel the new semaphore implementation I attached is very good, or at
least a good start. Maybe we could discuss from there rather than
talking in the abstract, so that we can fix any problems we find.

Regards,
David Zhou
>
> That's trivial to implement and should give us pretty much the same 
> semantics as the shared semaphore object in Vulkan.
>
> Christian.
>
>>
>> I'll try and think about it a bit harder tomorrow.
>>
>> Dave.
>
>