Re: [PATCH 5/5] [RFC]drm: add syncobj timeline support v3

Christian König <christian.koenig@xxxxxxx> · Mon, 3 Sep 2018 10:50:14 +0200

Am 03.09.2018 um 06:13 schrieb Chunming Zhou:

在 2018/8/30 19:32, Christian König 写道:
[SNIP]

+
+struct drm_syncobj_wait_pt {
+    struct drm_syncobj_stub_fence base;
+    u64    value;
+    struct rb_node   node;
+};
+struct drm_syncobj_signal_pt {
+    struct drm_syncobj_stub_fence base;
+    struct dma_fence *signal_fence;
+    struct dma_fence *pre_pt_base;
+    struct dma_fence_cb signal_cb;
+    struct dma_fence_cb pre_pt_cb;
+    struct drm_syncobj *syncobj;
+    u64    value;
+    struct list_head list;
+};

What are those two structures good for
They are used to record wait ops points and signal ops points.
For timeline, they are connected by timeline value, works like:
    a. signal pt increase timeline value to signal_pt value when 
signal_pt is signaled. signal_pt is depending on previous pt fence 
and itself signal fence from signal ops.
    b. wait pt tree checks if timeline value over itself value.

For normal, works like:
    a. signal pt increase 1 for syncobj_timeline->signal_point 
every time when signal ops is performed.
    b. when wait ops is comming, wait pt will fetch above last 
signal pt value as its wait point. wait pt will be only signaled 
by equal point value signal_pt.

and why is the stub fence their base?
Good question, I tried to kzalloc for them as well when I debug 
them, I encountered a problem:
I lookup/find wait_pt or signal_pt successfully, but when I tried 
to use them, they are freed sometimes, and results in NULL point.
and generally, when lookup them, we often need their stub fence as 
well, and in the meantime,  their lifecycle are same.
Above reason, I make stub fence as their base.

That sounds like you only did this because you messed up the 
lifecycle.

Additional to that I don't think you correctly considered the 
lifecycle of the waits and the sync object itself. E.g. blocking in 
drm_syncobj_timeline_fini() until all waits are done is not a good 
idea.

What you should do instead is to create a fence_array object with 
all the fence we need to wait for when a wait point is requested.
Yeah, this is our initial discussion result, but when I tried to do 
that, I found that cannot meet the advance signal requirement:
    a. We need to consider the wait and signal pt value are not 
one-to-one match, it's difficult to find dependent point, at least, 
there is some overhead.

As far as I can see that is independent of using a fence array here. 
See you can either use a ring buffer or an rb-tree, but when you want 
to wait for a specific point we need to condense the not yet signaled 
fences into an array.
again, need to find the range of where the specific point in, we 
should close to timeline semantics, I also refered the sw_sync.c 
timeline, usally wait_pt is signalled by timeline point. And I agree 
we can implement it with several methods, but I don't think there are 
basical differences.

The problem is that with your current approach you need the sync_obj 
alive for the synchronization to work. That is most likely not a good idea.

Additional to that you enable signaling without a need from the waiting 
side. That is rather bad for implementations which need that optimization.

I suggest to either condense all the fences you need to wait for in an 
array during the wait operation, or reference count the sync_obj and 
only enable the signaling on the fences when requested by a wait.

    b. because we allowed "wait-before-signal", if 
"wait-before-signal" happens, there isn't signal fence which can be 
used to create fence array.

Well, again we DON'T support wait-before-signal here. I will 
certainly NAK any implementation which tries to do this until we 
haven't figured out all the resource management constraints and I 
still don't see how we can do this.
yes, we don't support real wait-before-signal in patch now, just a 
fake wait-before-signal, which still wait on CS submission until 
signal operation coming through wait_event, which is the conclusion we 
disscussed before.

Well in this case we should call it wait-for-signal and not 
wait-before-signal :)

So timeline value is good to resolve that.

Otherwise somebody can easily construct a situation where timeline 
sync obj A waits on B and B waits on A.
Same situation also can happens with fence-array, we only can see 
this is a programming bug with incorrect use.

No, fence-array is initialized only once with a static list of 
fences. This way it is impossible to add the fence-array to itself 
for example.

E.g. you can't build circle dependencies with that.
we already wait for signal operation event, so never build circle 
dependencies with that. The theory is same.

Yeah, ok that is certainly true.

Regards,
Christian.

Better use rbtree_postorder_for_each_entry_safe() for this.
From the comments, seems we cannot use it here, we need erase node 
here.
Comments:
 * Note, however, that it cannot handle other modifications that 
re-order the
 * rbtree it is iterating over. This includes calling rb_erase() 
on @pos, as
 * rb_erase() may rebalance the tree, causing us to miss some nodes.
 */

Yeah, but your current implementation has the same problem :)

You need to use something like "while (node = rb_first(...))" 
instead if you want to destruct the whole tree.
OK, got it, I will change it in v4.

Thanks,
David Zhou

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel