On 21/04/16 13:05, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>

i915_gem_obj_to_vma is one of the most expensive functions in our
profiles. Could avoiding some branching by replacing it with arithmetic
be beneficial? Some benchmarks suggest it slightly might.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
---
 drivers/gpu/drm/i915/i915_gem.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0549dea683e1..243bfb922eb3 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4642,11 +4642,21 @@ struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
 				     struct i915_address_space *vm)
 {
 	struct i915_vma *vma;
+
+	BUILD_BUG_ON(I915_GGTT_VIEW_NORMAL != 0);
+
 	list_for_each_entry(vma, &obj->vma_list, obj_link) {
-		if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL &&
-		    vma->vm == vm)
+		/*
+		 * Below is just a branching avoiding way of saying:
+		 * vma_ggtt_view.type == I915_GGTT_VIEW_NORMAL && vma->vm == vm,
+		 * which relies on the fact I915_GGTT_VIEW_NORMAL has to be
+		 * zero.
+		 */
+		if (!((unsigned long)vma->ggtt_view.type |
+		      ((unsigned long)vma->vm ^ (unsigned long)vm)))
 			return vma;
 	}
+
 	return NULL;
 }
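To make the equivalence in the patch comment concrete, here is a minimal userspace sketch comparing the two forms of the test. The enum, struct and function names are hypothetical stand-ins and not the driver's definitions; uintptr_t is used in place of unsigned long purely for portability outside the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver types; only the fields used here. */
enum view_type { VIEW_NORMAL = 0, VIEW_ROTATED = 1 };

struct vma {
	enum view_type type;
	const void *vm;
};

/* Original form: two comparisons joined by a short-circuiting &&. */
static bool match_branchy(const struct vma *vma, const void *vm)
{
	return vma->type == VIEW_NORMAL && vma->vm == vm;
}

/*
 * Arithmetic form: (a ^ b) is zero only when a == b, and OR-ing in the
 * type keeps the result zero only when the type is zero as well. This
 * is why the NORMAL view type has to be zero.
 */
static bool match_branchless(const struct vma *vma, const void *vm)
{
	return !((uintptr_t)vma->type |
		 ((uintptr_t)vma->vm ^ (uintptr_t)vm));
}
```

Whether the compiler actually emits fewer branches for the second form depends on the target and optimisation level, so it would still need measuring.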
Other alternatives might include splitting the vma_list, so that we have one list for the most frequently searched-for entries (GGTT view NORMAL) and another for everything else; the lookup above would then need only a single equality test (on the vm).
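Roughly what I have in mind for the split, as a standalone sketch. All names here are made up for illustration, it uses a plain singly linked list rather than the kernel's list_head, and it assumes the NORMAL view type is 0:

```c
#include <stddef.h>

/* Hypothetical, simplified types; not the driver's structures. */
struct vma {
	struct vma *next;
	int view_type;		/* 0 is assumed to mean NORMAL */
	const void *vm;
};

struct object {
	struct vma *normal_vmas;	/* NORMAL-view VMAs only */
	struct vma *other_vmas;		/* every other view type */
};

/* Pick the list once, at insertion time, based on the view type. */
static void object_add_vma(struct object *obj, struct vma *vma)
{
	struct vma **head = vma->view_type == 0 ?
			    &obj->normal_vmas : &obj->other_vmas;

	vma->next = *head;
	*head = vma;
}

/* The hot lookup now needs a single comparison per entry. */
static struct vma *object_to_normal_vma(const struct object *obj,
					const void *vm)
{
	struct vma *vma;

	for (vma = obj->normal_vmas; vma; vma = vma->next)
		if (vma->vm == vm)
			return vma;

	return NULL;
}
```

The cost is that anything which currently walks all VMAs of an object would have to walk both lists.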
Or, slightly less effectively, add GGTT/NORMAL entries at the head of the list and others at the tail (and search backwards if you *don't* want a GGTT/NORMAL entry). That would still need the comparisons, but would likely hit an early match.
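A sketch of that ordering scheme, again with invented names and a hand-rolled circular list standing in for list_head (in the kernel this would presumably be list_add() versus list_add_tail(), with a reverse iterator for the non-NORMAL case):

```c
#include <stddef.h>

/* Hypothetical, simplified types; not the driver's structures. */
struct vma {
	struct vma *prev, *next;
	int view_type;		/* 0 is assumed to mean NORMAL */
	const void *vm;
};

struct object {
	struct vma head;	/* sentinel node of a circular list */
};

static void object_init(struct object *obj)
{
	obj->head.prev = obj->head.next = &obj->head;
}

/* NORMAL entries go to the front of the list, everything else to the back. */
static void object_add_vma(struct object *obj, struct vma *vma)
{
	struct vma *head = &obj->head;
	struct vma *prev = vma->view_type == 0 ? head : head->prev;
	struct vma *next = prev->next;

	vma->prev = prev;
	vma->next = next;
	prev->next = vma;
	next->prev = vma;
}

/* Walk forwards when looking for NORMAL, backwards for any other view. */
static struct vma *object_find_vma(struct object *obj, int view_type,
				   const void *vm)
{
	struct vma *head = &obj->head;
	int forward = (view_type == 0);
	struct vma *vma;

	for (vma = forward ? head->next : head->prev; vma != head;
	     vma = forward ? vma->next : vma->prev)
		if (vma->view_type == view_type && vma->vm == vm)
			return vma;

	return NULL;
}
```

Both comparisons are still there on every iteration; the gain, if any, comes only from finding the match within the first entry or two.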
.Dave.