Problem with using VSX register constraints in extended asm

Sam Yates <sam.yates@xxxxxxx> · Sat, 18 Apr 2015 00:22:12 +0200

Hello all!

I am attempting to use GCC's extended inline assembly to explicitly
use floating point vector operations on POWER8, but the generated code
refers to the wrong VSX register for one of the asm operands.

It could well be that I am misunderstanding the use of the "wd" asm constraint.

Sample code:

#include <cassert>

typedef double v2d __attribute((vector_size(16)));

__attribute((noinline)) void vadd(v2d &x,v2d y) {
    x+=y;
}

__attribute((noinline)) void vadd_asm(v2d &x,v2d y) {
    asm ("xvadddp %0,%0,%1\n\t": "+wd"(x): "wd"(y));
}

int main() {
    v2d a={1.,2.};
    v2d b={10.,11.};
    v2d c={0.,0.};

    c=a;
    vadd(c,b);
    assert(c[0]==11.f);
    assert(c[1]==13.f);

    c=a;
    vadd_asm(c,b);
    assert(c[0]==11.f);
    assert(c[1]==13.f);
}

When this code is compiled with g++ 4.9.1 and the options -O3
-mcpu=power8 -mvsx, the second pair of assertions (those following the
call to vadd_asm) fails.

The generated assembly:
vadd(v2d &x,v2d y):

        lxvd2x 0,0,3
        xxpermdi 0,0,0,2
        xvadddp 34,0,34
        xxpermdi 34,34,34,2
        stxvd2x 34,0,3
        blr

vadd_asm(v2d &x,v2d y):

        lxvd2x 0,0,3
        xxpermdi 0,0,0,2
        xvadddp 0,0,2
        xxpermdi 0,0,0,2
        stxvd2x 0,0,3
        blr

I surmise that the ABI dictates that the vector parameter y be
passed in VR[2] (equivalently VSR[34]). The compiler-generated vadd
does use register 34 in its xvadddp, but in vadd_asm the xvadddp
instruction names VSR[2] instead.
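
To make the numbering I am assuming explicit: as I understand it, the 64
VSX registers overlay the two older register files, with VSR[0..31]
sharing storage with the floating point registers and VSR[32..63] with
the vector registers VR[0..31]. The small sketch below only spells out
that arithmetic; the helper names are my own and are not part of GCC or
the ABI.

```cpp
#include <cassert>

// Hypothetical helpers, for illustration only: they encode the assumed
// overlay of the FPR and VR register files onto the 64 VSX registers.

// Floating point register n shares storage with VSR[n].
constexpr int vsr_from_fpr(int fpr) { return fpr; }

// Vector (AltiVec) register n shares storage with VSR[n + 32].
constexpr int vsr_from_vr(int vr) { return vr + 32; }

// The parameter y arrives in VR[2], which a VSX instruction must name as 34.
static_assert(vsr_from_vr(2) == 34, "VR[2] is VSR[34]");

// A bare "2" in a VSX instruction therefore names FPR 2, not VR[2].
static_assert(vsr_from_fpr(2) == 2, "FPR 2 is VSR[2]");
```

Under that assumption, the operand "2" emitted in vadd_asm looks like a
VR number printed where a VSR number was expected.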

What would be the correct constraint to use in the asm statement in
order to generate correct code? And why is "wd" inappropriate?

Best regards,
Sam