[PATCH 5/5] [RFC]drm: add syncobj timeline support v3

christian.koenig@xxxxxxx (Christian König) · Mon, 3 Sep 2018 10:50:14 +0200

Am 03.09.2018 um 06:13 schrieb Chunming Zhou:
>
>
> å?¨ 2018/8/30 19:32, Christian KÃ¶nig å??é??:
>> [SNIP]
>>>>>>
>>>>>>> +
>>>>>>> +struct drm_syncobj_wait_pt {
>>>>>>> +Â Â Â  struct drm_syncobj_stub_fence base;
>>>>>>> +Â Â Â  u64Â Â Â  value;
>>>>>>> +Â Â Â  struct rb_nodeÂ Â  node;
>>>>>>> +};
>>>>>>> +struct drm_syncobj_signal_pt {
>>>>>>> +Â Â Â  struct drm_syncobj_stub_fence base;
>>>>>>> +Â Â Â  struct dma_fence *signal_fence;
>>>>>>> +Â Â Â  struct dma_fence *pre_pt_base;
>>>>>>> +Â Â Â  struct dma_fence_cb signal_cb;
>>>>>>> +Â Â Â  struct dma_fence_cb pre_pt_cb;
>>>>>>> +Â Â Â  struct drm_syncobj *syncobj;
>>>>>>> +Â Â Â  u64Â Â Â  value;
>>>>>>> +Â Â Â  struct list_head list;
>>>>>>> +};
>>>>>>
>>>>>> What are those two structures good for
>>>>> They are used to record wait ops points and signal ops points.
>>>>> For timeline, they are connected by timeline value, works like:
>>>>> Â Â Â  a. signal pt increase timeline value to signal_pt value when 
>>>>> signal_pt is signaled. signal_pt is depending on previous pt fence 
>>>>> and itself signal fence from signal ops.
>>>>> Â Â Â  b. wait pt tree checks if timeline value over itself value.
>>>>>
>>>>> For normal, works like:
>>>>> Â Â Â  a. signal pt increase 1 for syncobj_timeline->signal_point 
>>>>> every time when signal ops is performed.
>>>>> Â Â Â  b. when wait ops is comming, wait pt will fetch above last 
>>>>> signal pt value as its wait point. wait pt will be only signaled 
>>>>> by equal point value signal_pt.
>>>>>
>>>>>
>>>>>> and why is the stub fence their base?
>>>>> Good question, I tried to kzalloc for them as well when I debug 
>>>>> them, I encountered a problem:
>>>>> I lookup/find wait_pt or signal_pt successfully, but when I tried 
>>>>> to use them, they are freed sometimes, and results in NULL point.
>>>>> and generally, when lookup them, we often need their stub fence as 
>>>>> well, and in the meantime,Â  their lifecycle are same.
>>>>> Above reason, I make stub fence as their base.
>>>>
>>>> That sounds like you only did this because you messed up the 
>>>> lifecycle.
>>>>
>>>> Additional to that I don't think you correctly considered the 
>>>> lifecycle of the waits and the sync object itself. E.g. blocking in 
>>>> drm_syncobj_timeline_fini() until all waits are done is not a good 
>>>> idea.
>>>>
>>>> What you should do instead is to create a fence_array object with 
>>>> all the fence we need to wait for when a wait point is requested.
>>> Yeah, this is our initial discussion result, but when I tried to do 
>>> that, I found that cannot meet the advance signal requirement:
>>> Â Â Â  a. We need to consider the wait and signal pt value are not 
>>> one-to-one match, it's difficult to find dependent point, at least, 
>>> there is some overhead.
>>
>> As far as I can see that is independent of using a fence array here. 
>> See you can either use a ring buffer or an rb-tree, but when you want 
>> to wait for a specific point we need to condense the not yet signaled 
>> fences into an array.
> again, need to find the range of where the specific point in, we 
> should close to timeline semantics, I also refered the sw_sync.c 
> timeline, usally wait_pt is signalled by timeline point. And I agree 
> we can implement it with several methods, but I don't think there are 
> basical differences.

The problem is that with your current approach you need the sync_obj 
alive for the synchronization to work. That is most likely not a good idea.

Additional to that you enable signaling without a need from the waiting 
side. That is rather bad for implementations which need that optimization.

I suggest to either condense all the fences you need to wait for in an 
array during the wait operation, or reference count the sync_obj and 
only enable the signaling on the fences when requested by a wait.

>
>>
>>> Â Â Â  b. because we allowed "wait-before-signal", if 
>>> "wait-before-signal" happens, there isn't signal fence which can be 
>>> used to create fence array.
>>
>> Well, again we DON'T support wait-before-signal here. I will 
>> certainly NAK any implementation which tries to do this until we 
>> haven't figured out all the resource management constraints and I 
>> still don't see how we can do this.
> yes, we don't support real wait-before-signal in patch now, just a 
> fake wait-before-signal, which still wait on CS submission until 
> signal operation coming through wait_event, which is the conclusion we 
> disscussed before.

Well in this case we should call it wait-for-signal and not 
wait-before-signal :)

>
>>
>>> So timeline value is good to resolve that.
>>>
>>>>
>>>> Otherwise somebody can easily construct a situation where timeline 
>>>> sync obj A waits on B and B waits on A.
>>> Same situation also can happens with fence-array, we only can see 
>>> this is a programming bug with incorrect use.
>>
>> No, fence-array is initialized only once with a static list of 
>> fences. This way it is impossible to add the fence-array to itself 
>> for example.
>>
>> E.g. you can't build circle dependencies with that.
> we already wait for signal operation event, so never build circle 
> dependencies with that. The theory is same.

Yeah, ok that is certainly true.

Regards,
Christian.

>
>
>>
>>> Better use rbtree_postorder_for_each_entry_safe() for this.
>>>>> From the comments, seems we cannot use it here, we need erase node 
>>>>> here.
>>>>> Comments:
>>>>> Â * Note, however, that it cannot handle other modifications that 
>>>>> re-order the
>>>>> Â * rbtree it is iterating over. This includes calling rb_erase() 
>>>>> on @pos, as
>>>>> Â * rb_erase() may rebalance the tree, causing us to miss some nodes.
>>>>> Â */
>>>>
>>>> Yeah, but your current implementation has the same problem :)
>>>>
>>>> You need to use something like "while (node = rb_first(...))" 
>>>> instead if you want to destruct the whole tree.
> OK, got it, I will change it in v4.
>
> Thanks,
> David Zhou