On 4/28/20 6:18 PM, Chris Wilson wrote:
Quoting Nirmoy Das (2020-04-28 17:04:23)
Userspace can severely fragment rb_hole_addr rbtree by manipulating
alignment while allocating buffers. Fragmented rb_hole_addr rbtree would
result in large delays while allocating buffer object for a userspace
application. It takes long time to find suitable hole because if we fail
to find a suitable hole in the first attempt then we look for neighbouring
nodes using rb_prev(). Traversing rbtree using rb_prev() can take really
long time if the tree is fragmented.
This patch improves searches in fragmented rb_hole_addr rbtree by storing
an extra field in drm_mm_node, max_hole_size. Each drm_mm_node now stores
maximum hole size for its subtree in drm_mm_node->max_hole_size. Using
drm_mm_node->max_hole_size, it is possible to eliminate complete
subtree if that subtree is unable to serve a request hence reducing number
of rb_prev() used.
Note: Implementation details of rb_hole_addr rbtree updates after a node
removal and addition is borrowed from block/bfq-wf2q.c which is trying to
achieve a similar goal.
Feels like it is an augmented rbtree?
I just read about augmented rbtree quickly and yes it feels like so. I
will try to understand more
about it and will see if I can convert this logic to a augmented rbtree.
+static struct drm_mm_node *
+next_hole_high_addr(struct drm_mm_node *entry, u64 size)
+{
+ struct rb_node *rb_node, *left_rb_node, *parent_rb_node;
+ struct drm_mm_node *left_node;
+
+ if (!entry)
+ return false;
+
+ rb_node = &entry->rb_hole_addr;
+ if (rb_node->rb_left) {
+ left_rb_node = rb_node->rb_left;
+ parent_rb_node = rb_parent(rb_node);
+ left_node = rb_entry(left_rb_node,
+ struct drm_mm_node, rb_hole_addr);
+ if ((left_node->max_hole_size < size ||
+ entry->size == entry->max_hole_size) &&
+ parent_rb_node && parent_rb_node->rb_left != rb_node)
+ return rb_hole_addr_to_node(parent_rb_node);
+ }
+
+ return rb_hole_addr_to_node(rb_prev(rb_node));
+}
The max_hole_size would equally apply to next_hole_low_addr(), right?
Sorry, yes we need that too. I will add next_hole_low_addr().
Sadly, I did not explore the fragmented search LOW/HIGH in test-drm_mm.c.
That would be a useful addition as well.
Yes that would be useful too.
I used ./tools/testing/selftests/drivers/gpu/drm_mm.sh to verify this patch.
-Chris
Thanks for reviewing,
Nirmoy
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel