On Fri, May 20, 2016 at 01:31:30AM +0100, Dave Gordon wrote:
> Signed-off-by: Dave Gordon <david.s.gordon@xxxxxxxxx>
> Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>

Much better. The effect of the inline

gem:exec:fault:1MiB:             +4.90%
gem:exec:fault:1MiB:forked:      +7.99%
gem:exec:fault:16MiB:           +22.94%
gem:exec:fault:16MiB:forked:    +19.96%
gem:exec:fault:256MiB:          +27.45%
gem:exec:fault:256MiB:forked:   +36.89%

And it brings this series into parity with mine.

----
Avoiding the out-of-line call to sg_next() reduces the kernel execution
overhead by 10% in some workloads (for example the Unreal Engine 4 demo
Atlantis on 2GiB GTTs) which are dominated by the cost of inserting PTEs
due to texture thrashing. We can demonstrate this in a microbenchmark
that forces us to rebind the object on every execbuf, where we can
measure a 25% improvement, in the time required to execute an execbuf
requiring a texture to be rebound, for inlining the sg_next() for large
texture sizes.

Benchmark: igt/benchmarks/gem_exec_fault
Benchmark: igt/benchmarks/gem_exec_trace/Atlantis
-Chris

--
Chris Wilson, Intel Open Source Technology Centre