On Tue, Apr 26, 2016 at 11:35:53AM +0100, Dave Gordon wrote:
> On 21/04/16 13:05, Tvrtko Ursulin wrote:
> >From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> >
> >i915_gem_obj_to_vma is one of the most expensive functions in
> >our profiles. Could avoiding some branching by replacing it
> >with arithmetic be beneficial? Some benchmarks suggest it
> >slightly might.
> >
> >Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
> >---
> > drivers/gpu/drm/i915/i915_gem.c | 14 ++++++++++++--
> > 1 file changed, 12 insertions(+), 2 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >index 0549dea683e1..243bfb922eb3 100644
> >--- a/drivers/gpu/drm/i915/i915_gem.c
> >+++ b/drivers/gpu/drm/i915/i915_gem.c
> >@@ -4642,11 +4642,21 @@ struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
> >                                      struct i915_address_space *vm)
> > {
> >         struct i915_vma *vma;
> >+
> >+        BUILD_BUG_ON(I915_GGTT_VIEW_NORMAL != 0);
> >+
> >         list_for_each_entry(vma, &obj->vma_list, obj_link) {
> >-                if (vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL &&
> >-                    vma->vm == vm)
> >+                /*
> >+                 * Below is just a branching avoiding way of saying:
> >+                 * vma_ggtt_view.type == I915_GGTT_VIEW_NORMAL && vma->vm == vm,
> >+                 * which relies on the fact I915_GGTT_VIEW_NORMAL has to be
> >+                 * zero.
> >+                 */
> >+                if (!((unsigned long)vma->ggtt_view.type |
> >+                      ((unsigned long)vma->vm ^ (unsigned long)vm)))
> >                         return vma;
> >         }
> >+
> >         return NULL;
> > }
>
> Other alternatives might include splitting the vma_list, so that we
> have one list for the most-frequently searched-for entries (GGTT
> view NORMAL) and for everything else, so the above would just need a
> single test for equality.
>
> Or, slightly less effectively, add GGTT/NORMAL entries at the head
> of the list and others at the tail (and search backwards if you
> *don't* want a GGTT/NORMAL entry). That would still need the
> comparisons, but would likely hit an early match.

We want one list for convenience elsewhere, but can keep an rhashtable
(rht) in parallel. This is not as effective or as important as keeping a
hashtable to translate from handle to vma, but it is still useful for
some stress cases.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
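
For illustration only, here is a minimal userspace sketch of the "one list
plus a parallel lookup table" idea mentioned above. It is not the i915
code and does not use the kernel's rhashtable API: a small fixed-size
chained table stands in for it, and every name here (struct object,
object_add_vma, object_lookup_vma, VMA_BUCKETS, VIEW_NORMAL) is a
hypothetical stand-in chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VMA_BUCKETS 16  /* hypothetical table size, must be a power of two */
#define VIEW_NORMAL 0   /* stand-in for I915_GGTT_VIEW_NORMAL */

struct vm { int id; };  /* stand-in for struct i915_address_space */

struct vma {
	struct vm *vm;
	int view_type;
	struct vma *list_next;  /* link in the single per-object list */
	struct vma *hash_next;  /* link in the parallel lookup table */
};

struct object {
	struct vma *vma_list;              /* kept for users that walk all vmas */
	struct vma *buckets[VMA_BUCKETS];  /* parallel index keyed by vm pointer */
};

/* Map an address-space pointer to a bucket index. */
static size_t vm_hash(const struct vm *vm)
{
	return (size_t)(((uintptr_t)vm >> 4) & (VMA_BUCKETS - 1));
}

/* Insert a vma at the head of both the list and its hash chain. */
static void object_add_vma(struct object *obj, struct vma *vma)
{
	size_t b = vm_hash(vma->vm);

	vma->list_next = obj->vma_list;
	obj->vma_list = vma;

	vma->hash_next = obj->buckets[b];
	obj->buckets[b] = vma;
}

/* Find the normal-view vma for vm by walking only one hash chain. */
static struct vma *object_lookup_vma(const struct object *obj,
				     const struct vm *vm)
{
	struct vma *vma;

	for (vma = obj->buckets[vm_hash(vm)]; vma; vma = vma->hash_next) {
		if (vma->vm == vm && vma->view_type == VIEW_NORMAL)
			return vma;
	}

	return NULL;
}

/* Walk the single list, as code that needs every vma would. */
static size_t object_count_vmas(const struct object *obj)
{
	const struct vma *vma;
	size_t n = 0;

	for (vma = obj->vma_list; vma; vma = vma->list_next)
		n++;

	return n;
}
```

The single list stays available for code that iterates over every vma,
while a lookup by address space only touches the entries that hash to the
same bucket. A real implementation would also need removal and locking,
which this sketch leaves out.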