On 03.09.2018 at 12:07, Chunming Zhou wrote:
>
>
> On 2018/9/3 16:50, Christian König wrote:
>> On 03.09.2018 at 06:13, Chunming Zhou wrote:
>>>
>>>
>>> On 2018/8/30 19:32, Christian König wrote:
>>>> [SNIP]
>>>>>>>>
>>>>>>>>> +
>>>>>>>>> +struct drm_syncobj_wait_pt {
>>>>>>>>> +   struct drm_syncobj_stub_fence base;
>>>>>>>>> +   u64   value;
>>>>>>>>> +   struct rb_node  node;
>>>>>>>>> +};
>>>>>>>>> +struct drm_syncobj_signal_pt {
>>>>>>>>> +   struct drm_syncobj_stub_fence base;
>>>>>>>>> +   struct dma_fence *signal_fence;
>>>>>>>>> +   struct dma_fence *pre_pt_base;
>>>>>>>>> +   struct dma_fence_cb signal_cb;
>>>>>>>>> +   struct dma_fence_cb pre_pt_cb;
>>>>>>>>> +   struct drm_syncobj *syncobj;
>>>>>>>>> +   u64   value;
>>>>>>>>> +   struct list_head list;
>>>>>>>>> +};
>>>>>>>>
>>>>>>>> What are those two structures good for
>>>>>>> They are used to record wait op points and signal op points.
>>>>>>> For the timeline case they are connected by the timeline value
>>>>>>> and work like this:
>>>>>>>    a. A signal_pt increases the timeline value to its own value
>>>>>>> when the signal_pt is signaled. A signal_pt depends on the
>>>>>>> previous pt fence and on its own signal fence from the signal op.
>>>>>>>    b. The wait_pt tree checks whether the timeline value has
>>>>>>> reached its own value.
>>>>>>>
>>>>>>> For the normal (non-timeline) case it works like this:
>>>>>>>    a. A signal_pt increases syncobj_timeline->signal_point by 1
>>>>>>> every time a signal op is performed.
>>>>>>>    b. When a wait op comes in, the wait_pt fetches the last
>>>>>>> signal_pt value above as its wait point. The wait_pt is only
>>>>>>> signaled by the signal_pt with the equal point value.
>>>>>>>
>>>>>>>
>>>>>>>> and why is the stub fence their base?
>>>>>>> Good question. I tried to kzalloc them separately as well while
>>>>>>> debugging and ran into a problem:
>>>>>>> I look up/find a wait_pt or signal_pt successfully, but when I
>>>>>>> try to use it, it has sometimes already been freed, resulting in
>>>>>>> a NULL pointer dereference.
>>>>>>> Also, when we look them up we usually need their stub fence as
>>>>>>> well, and their lifecycles are the same.
>>>>>>> For those reasons I made the stub fence their base.
>>>>>>
>>>>>> That sounds like you only did this because you messed up the
>>>>>> lifecycle.
>>>>>>
>>>>>> Additionally, I don't think you correctly considered the
>>>>>> lifecycle of the waits and the sync object itself. E.g. blocking
>>>>>> in drm_syncobj_timeline_fini() until all waits are done is not a
>>>>>> good idea.
>>>>>>
>>>>>> What you should do instead is to create a fence_array object with
>>>>>> all the fences we need to wait for when a wait point is requested.
>>>>> Yeah, this was our initial discussion result, but when I tried to
>>>>> do that, I found that it cannot meet the advance-signal
>>>>> requirement:
>>>>>    a. We need to consider that the wait and signal pt values are
>>>>> not a one-to-one match, so it is difficult to find the dependent
>>>>> point; at least there is some overhead.
>>>>
>>>> As far as I can see that is independent of using a fence array
>>>> here. You can either use a ring buffer or an rb-tree, but when you
>>>> want to wait for a specific point we need to condense the not yet
>>>> signaled fences into an array.
>>> Again, we need to find the range the specific point falls into; we
>>> should stay close to the timeline semantics. I also referred to the
>>> sw_sync.c timeline, where a wait_pt is usually signaled by a
>>> timeline point. And I agree we can implement it with several
>>> methods, but I don't think there are fundamental differences.
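For reference, the a/b timeline semantics described above boil down to
something like the following minimal sketch. The type and function
names here are illustrative only and do not appear in the patch;
locking and error handling are omitted:

#include <linux/types.h>

/* illustrative only, not the structures from the patch */
struct timeline_sketch {
	u64 timeline_value;	/* highest point signaled so far */
	u64 signal_point;	/* last point handed out to a signal op */
};

/* a. a signal point advances the timeline when its fence signals */
static void sketch_signal_pt_signaled(struct timeline_sketch *tl, u64 point)
{
	if (point > tl->timeline_value)
		tl->timeline_value = point;
}

/* b. a wait point is satisfied once the timeline reaches its value */
static bool sketch_wait_pt_signaled(const struct timeline_sketch *tl, u64 point)
{
	return tl->timeline_value >= point;
}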
>>
>> The problem is that with your current approach you need the sync_obj
>> to stay alive for the synchronization to work. That is most likely
>> not a good idea.
> Indeed, I will fix that. How about only creating a fence array for
> every wait_pt when the syncobj is released? At syncobj release time,
> every wait_pt must already have waited for its signal operation, so we
> can easily condense the fences for every wait_pt. And in the meantime
> we keep the advantage of timeline-based wait_pts.

That could work, but I don't see how you want to replace the already
issued fence with a fence_array when the sync object is destroyed.

In addition to that I would rather prefer consistent handling, e.g.
without extra rarely used code paths.

>
>>
>> In addition to that, you enable signaling without a need from the
>> waiting side. That is rather bad for implementations which need that
>> optimization.
> Do you mean that increasing the timeline based on the signal fence is
> not better? Only update the timeline value when requested by a
> wait_pt?

Yes, exactly.

> This way we will not update the timeline value immediately and cannot
> free the signal_pt immediately, and we also need to consider CPU
> query and wait.

That is actually the better coding style. We usually try to avoid doing
things in interrupt handlers as much as possible.

How about this idea:
1. Each signaling point is a fence implementation with an rb node.
2. Each node keeps a reference to the last previously inserted node.
3. Each node is referenced by the sync object itself.
4. Before each signal/wait operation we do a garbage collection and
   remove the first node from the tree as long as it is signaled.
5. When enable_signaling is requested for a node we cascade that to the
   left using rb_prev.
   This ensures that signaling is enabled for the current fence as well
   as all previous fences.
6. A wait just looks into the tree for the signal point lower or equal
   to the requested sequence number.
7. When the sync object is released we use
   rbtree_postorder_for_each_entry_safe() and drop the extra reference
   to each node, but never call rb_erase!
   This way the rb_tree stays in memory, but without a root (e.g. the
   sync object). It only destructs itself when the looked-up references
   to the nodes are dropped.

Well that is quite a bunch of logic, but I think that should work fine
(a rough sketch of steps 4 and 7 follows further below).

Regards,
Christian.

>
> Thanks,
> David Zhou
>
>>
>> I suggest to either condense all the fences you need to wait for into
>> an array during the wait operation, or reference count the sync_obj
>> and only enable the signaling on the fences when requested by a wait.
>>
>>>
>>>>
>>>>>    b. Because we allow "wait-before-signal": if
>>>>> "wait-before-signal" happens, there is no signal fence yet which
>>>>> could be used to create the fence array.
>>>>
>>>> Well, again we DON'T support wait-before-signal here. I will
>>>> certainly NAK any implementation which tries to do this until we
>>>> have figured out all the resource management constraints, and I
>>>> still don't see how we can do this.
>>> Yes, we don't support real wait-before-signal in the patch now, just
>>> a fake wait-before-signal, which still makes the CS submission wait
>>> until the signal operation comes through wait_event, which is the
>>> conclusion we discussed before.
>>
>> Well in this case we should call it wait-for-signal and not
>> wait-before-signal :)
>>
>>>
>>>>
>>>>> So the timeline value is good for resolving that.
>>>>>
>>>>>>
>>>>>> Otherwise somebody can easily construct a situation where
>>>>>> timeline sync obj A waits on B and B waits on A.
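Coming back to the idea above, a rough sketch of how steps 4 and 7
could look, assuming a signal point embeds a dma_fence plus an rb_node
as in step 1. The names timeline_signal_pt, timeline_garbage_collect
and timeline_release are made up for illustration and are not from the
patch; locking is omitted:

#include <linux/rbtree.h>
#include <linux/dma-fence.h>

struct timeline_signal_pt {
	struct dma_fence base;
	struct rb_node node;
};

/* 4. garbage collection: drop leading nodes that already signaled */
static void timeline_garbage_collect(struct rb_root *tree)
{
	struct rb_node *node;

	while ((node = rb_first(tree))) {
		struct timeline_signal_pt *pt =
			rb_entry(node, struct timeline_signal_pt, node);

		if (!dma_fence_is_signaled(&pt->base))
			break;

		rb_erase(node, tree);
		dma_fence_put(&pt->base);	/* drop the sync object's reference */
	}
}

/* 7. release: drop all references, but never call rb_erase() */
static void timeline_release(struct rb_root *tree)
{
	struct timeline_signal_pt *pt, *next;

	rbtree_postorder_for_each_entry_safe(pt, next, tree, node)
		dma_fence_put(&pt->base);
}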
>>>>> The same situation can also happen with a fence array; we can only
>>>>> see this as a programming bug from incorrect use.
>>>>
>>>> No, a fence array is initialized only once with a static list of
>>>> fences. This way it is impossible to add the fence array to itself,
>>>> for example.
>>>>
>>>> E.g. you can't build circular dependencies with that.
>>> We already wait for the signal operation event, so we never build
>>> circular dependencies with that. The theory is the same.
>>
>> Yeah, ok that is certainly true.
>>
>> Regards,
>> Christian.
>>
>>>
>>>
>>>>
>>>>>>>> Better use rbtree_postorder_for_each_entry_safe() for this.
>>>>>>> From the comments it seems we cannot use it here, because we
>>>>>>> need to erase nodes here.
>>>>>>> Comments:
>>>>>>>  * Note, however, that it cannot handle other modifications that
>>>>>>> re-order the
>>>>>>>  * rbtree it is iterating over. This includes calling rb_erase()
>>>>>>> on @pos, as
>>>>>>>  * rb_erase() may rebalance the tree, causing us to miss some
>>>>>>> nodes.
>>>>>>>  */
>>>>>>
>>>>>> Yeah, but your current implementation has the same problem :)
>>>>>>
>>>>>> You need to use something like "while (node = rb_first(...))"
>>>>>> instead if you want to destruct the whole tree.
>>> OK, got it, I will change it in v4.
>>>
>>> Thanks,
>>> David Zhou
>>
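For the teardown case where the nodes really do have to be erased, the
"while (node = rb_first(...))" pattern suggested above would look
roughly like this, reusing the drm_syncobj_wait_pt definition quoted
further up. free_wait_pt() is a hypothetical helper standing in for
whatever actually frees a wait point and is not a function from the
patch:

#include <linux/rbtree.h>

static void destruct_wait_pt_tree(struct rb_root *tree)
{
	struct rb_node *node;

	while ((node = rb_first(tree))) {
		struct drm_syncobj_wait_pt *wait_pt =
			rb_entry(node, struct drm_syncobj_wait_pt, node);

		/* always restart from rb_first(), so rebalancing by
		 * rb_erase() cannot make us miss nodes */
		rb_erase(node, tree);
		free_wait_pt(wait_pt);	/* hypothetical free helper */
	}
}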