Ian Lance Taylor wrote:
> Mike Sharov <msharov@xxxxxxxxxxxxxxxxxxxxx> writes:
>> I'm trying to create some template specializations to combine short movs
>> into larger movs.
>
> The compiler should normally do this for you automatically when optimizing.

Your faith in the compiler is touching, but unfounded. It doesn't do it.
For example:

    #include <stdint.h>

    class point {
    public:
        point (void);
        point (int16_t nx, int16_t ny) : x (nx), y (ny) {}
    public:
        int16_t x;
        int16_t y;
    };

    point::point (void) : x (0), y (0) { }

    void Assign (point& dest, const point& src) { dest = src; }

This generates the following assembly (Athlon64, -O3, -march=athlon64).
And no, -ftree-vectorize does not change the result.

    _ZN5pointC2Ev:
        movw    $0, (%rdi)
        movw    $0, 2(%rdi)
        ret
    _ZN5pointC1Ev:
        movw    $0, (%rdi)
        movw    $0, 2(%rdi)
        ret
    _Z6AssignR5pointRKS_:
        movzwl  (%rsi), %edx
        movzwl  2(%rsi), %eax
        movw    %dx, (%rdi)
        movw    %ax, 2(%rdi)
        ret

The first two blocks are the default constructors. Assign contains the
inlined implicit operator=. This is how the compiler always assigns to
member variables: one member at a time. If you have sixteen uint8_t vars
in your class, then, by God, you'll get 32 movb instructions in your
operator=, even though a pair of movups would have done.

I entirely agree that the compiler should be able to combine the movs,
but it does not, and I have no choice but to implement the sort of hacks
I was asking about. The question stands: how can the compiler be informed
of aliasing, or how can reordering be disabled for a particular block of
code?

-- 
Mike
msharov@xxxxxxxxxxxxxxxxxxxxx
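
For reference, one aliasing-safe way to write such a combined move by hand
is to go through std::memcpy instead of casting the object to a wider
integer type. The following is only a minimal sketch, not the approach
from this thread: the struct is a simplified stand-in for the point class
above, the names assign_combined and zero_combined are hypothetical, and
whether the result compiles down to a single 32-bit mov depends on the
compiler version and flags.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Simplified stand-in for the point class in the post (assumption:
// two 16-bit members and no padding, so the object is 32 bits wide).
struct point {
    std::int16_t x;
    std::int16_t y;
};

static_assert(std::is_trivially_copyable<point>::value,
              "memcpy is only valid for trivially copyable types");
static_assert(sizeof(point) == sizeof(std::uint32_t),
              "point must be exactly 32 bits wide");

// Hypothetical helper: copy both members as one 32-bit unit.
// memcpy is defined to copy bytes, so no strict-aliasing rule is broken
// and the compiler knows the stores belong to dest.
inline void assign_combined(point& dest, const point& src) {
    std::uint32_t bits;
    std::memcpy(&bits, &src, sizeof bits);   // one 32-bit load, in principle
    std::memcpy(&dest, &bits, sizeof bits);  // one 32-bit store, in principle
}

// Hypothetical helper: zero both members with one 32-bit store.
inline void zero_combined(point& p) {
    const std::uint32_t zero = 0;
    std::memcpy(&p, &zero, sizeof zero);
}
```

Calling assign_combined(b, a) leaves b.x == a.x and b.y == a.y, the same
result as the implicit operator=. The static_asserts make the size and
layout assumptions fail at compile time rather than silently miscopying.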