Odd error with the "X" inline assembly constraint

David Brown via Gcc-help <gcc-help@xxxxxxxxxxx> · Fri, 5 Jan 2024 17:30:51 +0100

I was looking at the code gcc generates, and at the optimisations and 
re-arrangements it makes, when -ffast-math is enabled, using code like 
this on <https://godbolt.org>:

typedef float T;
T test(T a, T b) {
    T x = a + b;
    //asm ("" :: "X" (x));
    asm ("" : "+X" (x));
    //asm ("" : "+X" (x) : "X" (x));
    return x - b;
}

Without the asm statements, gcc - as expected - skips the calculation of 
"x" and can then simplify "a + b - b" to "a".  I have previously used 
inline assembly of the form:

	asm ("" : "+g" (x));

to tell gcc "You need to calculate x before running this assembly and 
put it in a general register or memory, but it might change during the 
assembly so you must forget anything you knew about it before".  I've 
found it useful to force particular orders on calculations, or for 
debugging, or as a kind of fine-tuned alternative to a memory barrier.
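
As a minimal sketch of that idiom (the macro name FORCE_VALUE and the 
function add_then_sub are made up for illustration, and GCC-style 
extended asm is assumed):

```c
/* Hypothetical helper, not from any library: an empty asm that claims to
   read and modify v, so the compiler must compute v first and cannot
   assume anything about its value afterwards. */
#define FORCE_VALUE(v) __asm__ ("" : "+g" (v))

/* Without the barrier the compiler may fold (a + b) - b down to a. */
static int add_then_sub(int a, int b)
{
    int x = a + b;
    FORCE_VALUE(x);
    return x - b;
}
```

The returned value is the same either way; the difference should only be 
visible in the generated code, where the add and the subtract are both 
kept.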

But the "+g" operand is not ideal for floating point variables - it 
forces the compiler to move the variable from a floating point register 
into a general-purpose register, then back again.  The ideal choice 
seems to be "+X", since "X" matches any operand whatsoever.
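
For comparison, a sketch of the "+g" variant applied to the same test 
function (test_g is a made-up name).  It should compile and return the 
right value; the cost described above only shows up in the generated 
assembly, where x has to leave its floating point register around the 
asm:

```c
typedef float T;

/* Hypothetical "+g" version of the test function from the top of this
   mail.  The result is correct, but "g" only allows a general register,
   memory or an immediate, so x cannot stay in a floating point register
   across the asm. */
T test_g(T a, T b)
{
    T x = a + b;
    __asm__ ("" : "+g" (x));
    return x - b;
}
```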

However, when I use just

	asm ("" : "+X" (x));

on its own, I get the error message "error: inconsistent operand 
constraints in an 'asm'".  I have no idea why this is an issue.

Getting weirder, on x86-64, there is no error if I use

	asm ("" : "+X" (x) : "X" (x));

This gives me the desired effect of forcing "x" to be calculated and 
used in the final "x - b".

Even weirder, on 32-bit ARM, this still gives the inconsistent operand 
error.

Weirder still, this works error-free on both targets:

    asm ("" :: "X" (x));
    asm ("" : "+X" (x));

In my (non-exhaustive) testing, this gives optimal results on both 
targets, independent of the compiler version and type T.

I'd imagine that the "X" operand doesn't see much use in real inline 
assembly - on x86 and ARM the assembly instruction template would 
usually depend on where the data is put.  But if anyone can explain this 
behaviour to me, I am very curious to know what is going on.

David