Hello all!

I am attempting to use GCC's extended inline assembly to explicitly use floating point vector operations on POWER8, but the generated code does not correctly load the VSX registers. It could well be that I am misunderstanding the use of the "wd" asm constraint.

Sample code:

    #include <cassert>

    typedef double v2d __attribute((vector_size(16)));

    __attribute((noinline)) void vadd(v2d &x,v2d y) {
        x+=y;
    }

    __attribute((noinline)) void vadd_asm(v2d &x,v2d y) {
        asm ("xvadddp %0,%0,%1\n\t": "+wd"(x): "wd"(y));
    }

    int main() {
        v2d a={1.,2.};
        v2d b={10.,11.};
        v2d c={0.,0.};

        c=a;
        vadd(c,b);
        assert(c[0]==11.f);
        assert(c[1]==13.f);

        c=a;
        vadd_asm(c,b);
        assert(c[0]==11.f);
        assert(c[1]==13.f);
    }

When this code is compiled with g++ 4.9.1, with options -O3 -mcpu=power8 -mvsx, the second pair of assertions fails.

The generated assembly:

    vadd(v2d &x,v2d y):
        lxvd2x 0,0,3
        xxpermdi 0,0,0,2
        xvadddp 34,0,34
        xxpermdi 34,34,34,2
        stxvd2x 34,0,3
        blr

    vadd_asm(v2d &x,v2d y):
        lxvd2x 0,0,3
        xxpermdi 0,0,0,2
        xvadddp 0,0,2
        xxpermdi 0,0,0,2
        stxvd2x 0,0,3
        blr

I surmise that the ABI dictates that the vector parameter y be passed in VR[2] (equivalently VSR[34]). In vadd_asm, it can be seen that VSR[2] is used instead.

What would be the correct constraint to use in the asm statement in order to generate correct code? And why is "wd" inappropriate?

Best regards,
Sam
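One possible direction, sketched here rather than confirmed against g++ 4.9.1: the GCC documentation for the PowerPC VSX register constraints says that operands using them should be written as %x<n> in the asm template, so that the full VSX register number (0 to 63) is printed. With plain %1, an operand living in VR[2] would be printed as 2 rather than 34, which matches the xvadddp 0,0,2 seen above. Under that reading the "wd" constraint itself may be fine and only the template needs changing. The names vadd_asm_x and check_vadd_asm_x are hypothetical, and the non-VSX fallback branch is an assumption added only so the sketch builds on other targets.

```cpp
#include <cassert>

typedef double v2d __attribute__((vector_size(16)));

// Same operation as vadd_asm in the post, but every operand is written with
// the %x modifier so GCC prints the VSX register number (0..63) instead of
// the number within the FPR or VR file.
__attribute__((noinline)) void vadd_asm_x(v2d &x, v2d y) {
#if defined(__VSX__)
    asm("xvadddp %x0,%x0,%x1" : "+wd"(x) : "wd"(y));
#else
    // Assumption: plain vector add as a stand-in on targets without VSX,
    // so the sketch still compiles and can be checked elsewhere.
    x += y;
#endif
}

// Hypothetical helper mirroring the checks in main() from the post.
void check_vadd_asm_x() {
    v2d a = {1., 2.};
    v2d b = {10., 11.};
    v2d c = a;
    vadd_asm_x(c, b);
    assert(c[0] == 11.);
    assert(c[1] == 13.);
}
```

If this reading is right, compiling with the same options (-O3 -mcpu=power8 -mvsx) should show xvadddp using 34 for the second source operand, as in the compiler-generated vadd.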