Am 03.09.2018 um 06:13 schrieb Chunming Zhou: > > > å?¨ 2018/8/30 19:32, Christian König å??é??: >> [SNIP] >>>>>> >>>>>>> + >>>>>>> +struct drm_syncobj_wait_pt { >>>>>>> +   struct drm_syncobj_stub_fence base; >>>>>>> +   u64   value; >>>>>>> +   struct rb_node  node; >>>>>>> +}; >>>>>>> +struct drm_syncobj_signal_pt { >>>>>>> +   struct drm_syncobj_stub_fence base; >>>>>>> +   struct dma_fence *signal_fence; >>>>>>> +   struct dma_fence *pre_pt_base; >>>>>>> +   struct dma_fence_cb signal_cb; >>>>>>> +   struct dma_fence_cb pre_pt_cb; >>>>>>> +   struct drm_syncobj *syncobj; >>>>>>> +   u64   value; >>>>>>> +   struct list_head list; >>>>>>> +}; >>>>>> >>>>>> What are those two structures good for >>>>> They are used to record wait ops points and signal ops points. >>>>> For timeline, they are connected by timeline value, works like: >>>>>    a. signal pt increase timeline value to signal_pt value when >>>>> signal_pt is signaled. signal_pt is depending on previous pt fence >>>>> and itself signal fence from signal ops. >>>>>    b. wait pt tree checks if timeline value over itself value. >>>>> >>>>> For normal, works like: >>>>>    a. signal pt increase 1 for syncobj_timeline->signal_point >>>>> every time when signal ops is performed. >>>>>    b. when wait ops is comming, wait pt will fetch above last >>>>> signal pt value as its wait point. wait pt will be only signaled >>>>> by equal point value signal_pt. >>>>> >>>>> >>>>>> and why is the stub fence their base? >>>>> Good question, I tried to kzalloc for them as well when I debug >>>>> them, I encountered a problem: >>>>> I lookup/find wait_pt or signal_pt successfully, but when I tried >>>>> to use them, they are freed sometimes, and results in NULL point. >>>>> and generally, when lookup them, we often need their stub fence as >>>>> well, and in the meantime, their lifecycle are same. >>>>> Above reason, I make stub fence as their base. >>>> >>>> That sounds like you only did this because you messed up the >>>> lifecycle. >>>> >>>> Additional to that I don't think you correctly considered the >>>> lifecycle of the waits and the sync object itself. E.g. blocking in >>>> drm_syncobj_timeline_fini() until all waits are done is not a good >>>> idea. >>>> >>>> What you should do instead is to create a fence_array object with >>>> all the fence we need to wait for when a wait point is requested. >>> Yeah, this is our initial discussion result, but when I tried to do >>> that, I found that cannot meet the advance signal requirement: >>>    a. We need to consider the wait and signal pt value are not >>> one-to-one match, it's difficult to find dependent point, at least, >>> there is some overhead. >> >> As far as I can see that is independent of using a fence array here. >> See you can either use a ring buffer or an rb-tree, but when you want >> to wait for a specific point we need to condense the not yet signaled >> fences into an array. > again, need to find the range of where the specific point in, we > should close to timeline semantics, I also refered the sw_sync.c > timeline, usally wait_pt is signalled by timeline point. And I agree > we can implement it with several methods, but I don't think there are > basical differences. The problem is that with your current approach you need the sync_obj alive for the synchronization to work. That is most likely not a good idea. Additional to that you enable signaling without a need from the waiting side. That is rather bad for implementations which need that optimization. I suggest to either condense all the fences you need to wait for in an array during the wait operation, or reference count the sync_obj and only enable the signaling on the fences when requested by a wait. > >> >>>    b. because we allowed "wait-before-signal", if >>> "wait-before-signal" happens, there isn't signal fence which can be >>> used to create fence array. >> >> Well, again we DON'T support wait-before-signal here. I will >> certainly NAK any implementation which tries to do this until we >> haven't figured out all the resource management constraints and I >> still don't see how we can do this. > yes, we don't support real wait-before-signal in patch now, just a > fake wait-before-signal, which still wait on CS submission until > signal operation coming through wait_event, which is the conclusion we > disscussed before. Well in this case we should call it wait-for-signal and not wait-before-signal :) > >> >>> So timeline value is good to resolve that. >>> >>>> >>>> Otherwise somebody can easily construct a situation where timeline >>>> sync obj A waits on B and B waits on A. >>> Same situation also can happens with fence-array, we only can see >>> this is a programming bug with incorrect use. >> >> No, fence-array is initialized only once with a static list of >> fences. This way it is impossible to add the fence-array to itself >> for example. >> >> E.g. you can't build circle dependencies with that. > we already wait for signal operation event, so never build circle > dependencies with that. The theory is same. Yeah, ok that is certainly true. Regards, Christian. > > >> >>> Better use rbtree_postorder_for_each_entry_safe() for this. >>>>> From the comments, seems we cannot use it here, we need erase node >>>>> here. >>>>> Comments: >>>>>  * Note, however, that it cannot handle other modifications that >>>>> re-order the >>>>>  * rbtree it is iterating over. This includes calling rb_erase() >>>>> on @pos, as >>>>>  * rb_erase() may rebalance the tree, causing us to miss some nodes. >>>>>  */ >>>> >>>> Yeah, but your current implementation has the same problem :) >>>> >>>> You need to use something like "while (node = rb_first(...))" >>>> instead if you want to destruct the whole tree. > OK, got it, I will change it in v4. > > Thanks, > David Zhou