On 22/03/16 17:39, Tvrtko Ursulin wrote:
On 22/03/16 17:29, Ville Syrjälä wrote:
On Tue, Mar 22, 2016 at 05:16:52PM +0000, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Since we write four times to the same register, caching
the mmio register saves a tiny amount of generated code.
The compiler can't figure this out on its own?
Nope, at least gcc 4.8.4 I am running here can't. :(
And this only solves one part of the things it can't figure out in that
code. It still recalculates one part, can't remember which one is which
now without revisiting the generated assembly. It used to be four times
in a row: load register, add 0x230, displace 0x78, store[0-4]. This only
solves the add 0x230 redundancy. But working around that would possibly
be a bit too low level.
Regards,
Tvrtko
Compiler's probably assuming aliasing.
RING_ELSP(engine) is actually (engine->mmio_base+0x230).
I915_WRITE_FW(reg, val) is actually __raw_i915_write32(dev_priv,
(reg__), (val__)) which ultimately translates to a store to some address.
The compiler can't be sure that this store isn't actually to
(engine->mmio_base), so it refetches it and adds the 0x230 again. Saving
the (struct-valued) result of the RING_ELSP() macro means the compiler
knows it isn't aliased, so can reuse it four times.
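The aliasing argument above can be sketched in a small standalone model. This is not the i915 code: struct fake_engine, struct fake_priv, fake_write32() and both submit functions are hypothetical stand-ins, and only the 0x230 offset comes from the thread. In the first function the compiler has to allow for each store landing on engine->mmio_base, so it may reload and re-add before every write; in the second the address lives in a local that no store can reach.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ELSP_OFFSET 0x230u	/* offset quoted in the thread */

/* Hypothetical stand-ins, not the real driver structures. */
struct fake_engine {
	uint32_t mmio_base;
};

struct fake_priv {
	uint8_t *regs;		/* base of a simulated register window */
};

/* Assumed model of a raw 32-bit register write: a store to base + reg. */
static inline void fake_write32(struct fake_priv *priv, uint32_t reg,
				uint32_t val)
{
	assert(priv->regs != NULL);
	*(volatile uint32_t *)(priv->regs + reg) = val;
}

/* Address expression repeated: mmio_base may be reloaded before each store. */
static void submit_uncached(struct fake_priv *priv, struct fake_engine *engine,
			    const uint32_t desc[4])
{
	fake_write32(priv, engine->mmio_base + ELSP_OFFSET, desc[0]);
	fake_write32(priv, engine->mmio_base + ELSP_OFFSET, desc[1]);
	fake_write32(priv, engine->mmio_base + ELSP_OFFSET, desc[2]);
	fake_write32(priv, engine->mmio_base + ELSP_OFFSET, desc[3]);
}

/* Address computed once into a local whose address is never taken. */
static void submit_cached(struct fake_priv *priv, struct fake_engine *engine,
			  const uint32_t desc[4])
{
	const uint32_t elsp = engine->mmio_base + ELSP_OFFSET;

	fake_write32(priv, elsp, desc[0]);
	fake_write32(priv, elsp, desc[1]);
	fake_write32(priv, elsp, desc[2]);
	fake_write32(priv, elsp, desc[3]);
}
```

Both functions perform the same four stores; any difference shows up only in the generated code, and whether it does depends on the compiler version and flags.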
We could try adding __restrict to various key pointers, starting with
dev_priv and all pointers-to-engines?
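A rough sketch of what the __restrict idea might look like, again on hypothetical stand-in types rather than the real dev_priv and engine structures. The qualifier is a promise from the programmer that, within the function, what engine points to is not modified through any other pointer, which would permit the compiler to keep mmio_base + 0x230 in a register across the stores. Whether a given gcc actually takes advantage of that is an assumption that would need checking in the generated assembly.

```c
#include <assert.h>
#include <stdint.h>

#define ELSP_OFFSET 0x230u	/* offset quoted in the thread */

/* Hypothetical stand-ins, not the real driver structures. */
struct fake_engine {
	uint32_t mmio_base;
};

struct fake_priv {
	volatile uint32_t *regs;	/* simulated register window */
};

/*
 * __restrict is the GCC spelling of C99 restrict. It asserts that the
 * objects reached through priv and engine do not overlap in this call.
 */
static void submit_restrict(struct fake_priv *__restrict priv,
			    const struct fake_engine *__restrict engine,
			    const uint32_t desc[4])
{
	int i;

	/* The model indexes 32-bit words, so the byte offset must be aligned. */
	assert((engine->mmio_base + ELSP_OFFSET) % 4 == 0);

	for (i = 0; i < 4; i++)
		priv->regs[(engine->mmio_base + ELSP_OFFSET) / 4] = desc[i];
}
```

The promise is not verified by the compiler: if a caller ever passes pointers that do overlap, the behaviour is undefined, which is one reason to be careful about sprinkling the qualifier widely.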
.Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx