[RFC 3/3] drm/i915: Micro-optimize i915_gem_obj_to_vma

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>

i915_gem_obj_to_vma is one of the most expensive functions in
our profiles. Could avoiding some branching by replacing it
with arithmetic be beneficial? Some benchmarks suggest it
slightly might.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
---
 drivers/gpu/drm/i915/i915_gem.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0549dea683e1..243bfb922eb3 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4642,11 +4642,21 @@ struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
 				     struct i915_address_space *vm)
 {
 	struct i915_vma *vma;
+
+	BUILD_BUG_ON(I915_GGTT_VIEW_NORMAL != 0);
+
 	list_for_each_entry(vma, &obj->vma_list, obj_link) {
-		if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL &&
-		    vma->vm == vm)
+		/*
+		 * Below is just a branching avoiding way of saying:
+		 * vma_ggtt_view.type == I915_GGTT_VIEW_NORMAL && vma->vm == vm,
+		 * which relies on the fact I915_GGTT_VIEW_NORMAL has to be
+		 * zero.
+		 */
+		if (!((unsigned long)vma->ggtt_view.type |
+		    ((unsigned long)vma->vm ^ (unsigned long)vm)))
 			return vma;
 	}
+
 	return NULL;
 }
 
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux