On Tue, Dec 13, 2016 at 12:22:18PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
>
> A few details to hopefully make a very hot function a tiny bit
> more efficient:
>
>  1. Cast VM pointers before subtraction to save the compiler
>     doing a smart one which includes multiplication.

Indeed. Not pretty though.

static __always_inline __kernel_ptrdiff_t
ptrdiff(const void *a, const void *b)
{
	return a - b;
}

cmp = ptrdiff(vma->vm, vm);
if (cmp)
	return cmp;

>  2. Use smaller type for comparison since we only care about
>     the sign.

Should be a no-op, since the compiler should also only care about the
sign and not be moving the registers about, just the cc, and we should
be inlining... Is gcc not smart enough? :(

>
>  3. Prefer the ppgtt lookup branch and inline it, allowing the
>     compiler to optimise out the second part of i915_vma_compare
>     and save one call indirection.

This runs counter to a better optimisation that completely avoids
calling obj_to_vma for ppgtt lookups (i.e. in execbuffer we go straight
from handle to vma, skipping the handle to obj intermediate lookup).
Primary caller for this function should be ggtt users, with single
negative lookups before creating the ppgtt vma.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
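
For reference on point 1, here is a rough userspace sketch of why the cast matters. It is not the i915 code: struct vm is a hypothetical stand-in for the real address space struct, its 200-byte size is made up, and ptrdiff_demo() is only an illustrative self-check. Subtracting typed pointers yields a count of elements, so the compiler has to divide the byte distance by the struct size (usually emitted as a multiply by a magic constant), while subtracting byte pointers is a plain sub. The sketch casts to const char * so it stays within standard C; the kernel can subtract void pointers directly as a GNU extension.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the address space struct. The size is an
 * assumption, chosen as a non-power-of-two so the division is not a shift. */
struct vm {
	char payload[200];
};

/* Typed subtraction: the result is in units of sizeof(struct vm), so the
 * byte distance must be divided by 200. */
static inline ptrdiff_t cmp_typed(const struct vm *a, const struct vm *b)
{
	return a - b;
}

/* Byte subtraction: the result is the raw byte distance, no division.
 * The sign is the same as for the typed version, which is all a
 * comparison function needs. */
static inline ptrdiff_t ptrdiff(const void *a, const void *b)
{
	return (const char *)a - (const char *)b;
}

/* Illustrative self-check; returns 0 if both helpers agree on the sign. */
int ptrdiff_demo(void)
{
	struct vm arr[4];

	assert(cmp_typed(&arr[3], &arr[1]) == 2);
	assert(ptrdiff(&arr[3], &arr[1]) == 2 * (ptrdiff_t)sizeof(struct vm));
	assert(ptrdiff(&arr[1], &arr[3]) < 0);
	assert(ptrdiff(&arr[2], &arr[2]) == 0);
	return 0;
}
```

Both helpers are only well defined here because the pointers point into the same array. Narrowing the result to a smaller type (point 2) is a separate question: the sketch does not do it, since a truncated difference is not guaranteed to keep its sign.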