Quoting Sean Paul (2017-06-29 19:10:11)
> On Thu, Jun 29, 2017 at 01:59:30PM +0100, Chris Wilson wrote:
> > Reduce the list iteration when incrementing the timeline by storing the
> > fences in increasing order.
> >
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: Sumit Semwal <sumit.semwal@xxxxxxxxxx>
> > Cc: Sean Paul <seanpaul@xxxxxxxxxxxx>
> > Cc: Gustavo Padovan <gustavo@xxxxxxxxxxx>
> > ---
> >  drivers/dma-buf/sw_sync.c    | 49 ++++++++++++++++++++++++++++++++++++++------
> >  drivers/dma-buf/sync_debug.h |  5 +++++
> >  2 files changed, 48 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
> > index e51fe11bbbea..8cef5d345316 100644
> > --- a/drivers/dma-buf/sw_sync.c
> > +++ b/drivers/dma-buf/sw_sync.c
> > @@ -96,6 +96,7 @@ static struct sync_timeline *sync_timeline_create(const char *name)
> >          obj->context = dma_fence_context_alloc(1);
> >          strlcpy(obj->name, name, sizeof(obj->name));
> >
> > +        obj->pt_tree = RB_ROOT;
> >          INIT_LIST_HEAD(&obj->pt_list);
> >          spin_lock_init(&obj->lock);
> >
> > @@ -142,9 +143,13 @@ static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc)
> >
> >          obj->value += inc;
> >
> > -        list_for_each_entry_safe(pt, next, &obj->pt_list, link)
> > -                if (dma_fence_is_signaled_locked(&pt->base))
> > -                        list_del_init(&pt->link);
> > +        list_for_each_entry_safe(pt, next, &obj->pt_list, link) {
> > +                if (!dma_fence_is_signaled_locked(&pt->base))
> > +                        break;
> > +
> > +                list_del_init(&pt->link);
> > +                rb_erase(&pt->node, &obj->pt_tree);
> > +        }
> >
> >          spin_unlock_irq(&obj->lock);
> >  }
> > @@ -174,8 +179,38 @@ static struct sync_pt *sync_pt_create(struct sync_timeline *obj,
> >          INIT_LIST_HEAD(&pt->link);
> >
> >          spin_lock_irq(&obj->lock);
> > -        if (!dma_fence_is_signaled_locked(&pt->base))
> > -                list_add_tail(&pt->link, &obj->pt_list);
> > +        if (!dma_fence_is_signaled_locked(&pt->base)) {
> > +                struct rb_node **p = &obj->pt_tree.rb_node;
> > +                struct rb_node *parent = NULL;
> > +
> > +                while (*p) {
> > +                        struct sync_pt *other;
> > +                        int cmp;
> > +
> > +                        parent = *p;
> > +                        other = rb_entry(parent, typeof(*pt), node);
> > +                        cmp = value - other->base.seqno;
>
> We're imposing an implicit limitation on userspace here that points cannot be
> INT_MAX apart (since they'll be in the wrong order in the tree).

That's not a new limitation. It's an artifact of using the u32 timeline/seqno.

> I wonder how much this patch will actually save, given that the number of
> active points on a given timeline should be small. Do we have any evidence
> that this optimization is warranted?

The good news is that for empty/small trees, the overhead is tiny, less than
the cost of the syscall. I honestly don't know who uses sw_sync and so would
benefit. I figured I would throw it out here since it was trivial.
-Chris
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel
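
As an aside, here is a minimal standalone userspace sketch of the signed-difference
idiom behind "cmp = value - other->base.seqno". It is not part of the patch; the
seqno_cmp() helper and the sample values are made up for illustration. It shows why
nearby points order correctly even across the u32 wraparound, while two points more
than INT_MAX apart compare in the wrong order, which is the limitation discussed above.

#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper mirroring the "cmp = value - other->base.seqno" idiom:
 * the u32 difference is reinterpreted as a signed 32-bit value, so a is
 * considered "after" b when (a - b) mod 2^32 falls in the positive half of
 * the range.
 */
static int seqno_cmp(uint32_t a, uint32_t b)
{
        return (int32_t)(a - b);
}

int main(void)
{
        /* Nearby points order as expected, even across the u32 wraparound. */
        printf("cmp(2, 1)                   = %d\n", seqno_cmp(2, 1));
        printf("cmp(0x00000001, 0xfffffffe) = %d\n",
               seqno_cmp(0x00000001u, 0xfffffffeu)); /* 3: 1 is "after" */

        /*
         * Points more than INT_MAX apart invert: 0x80000001 compares as
         * "before" 0, so such points would land in the wrong order in the
         * rbtree (and already misorder for dma_fence signaling checks).
         */
        printf("cmp(0x80000001, 0)          = %d\n",
               seqno_cmp(0x80000001u, 0u));          /* negative */

        return 0;
}

Built with a plain cc, this prints positive values for the first two comparisons
and a negative one for the last, matching the INT_MAX-apart caveat raised in the
review.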