I was experimenting with gcc's code generation, looking at the
optimisations and re-arrangements it makes when -ffast-math is enabled,
using code like this on <https://godbolt.org>:

    typedef float T;
    T test(T a, T b) {
        T x = a + b;
        //asm ("" :: "X" (x));
        asm ("" : "+X" (x));
        //asm ("" : "+X" (x) : "X" (x));
        return x - b;
    }

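As background on why this is a -ffast-math matter: in strict IEEE
arithmetic "(a + b) - b" is not always equal to "a", because the
intermediate sum is rounded. A small sketch of that effect; the function
name and the input values are made up for illustration, and the volatile
is only there to force the sum to be rounded to float and stored:

```c
/* Illustration only: add_then_sub is a hypothetical name. */
float add_then_sub(float a, float b) {
    /* The volatile store forces a + b to be rounded to single precision,
       so the compiler cannot simplify the expression to "a". */
    volatile float x = a + b;
    return x - b;
}

/* add_then_sub(1.0f, 1e8f) gives 0.0f rather than 1.0f: floats near 1e8
   are spaced 8 apart, so 1.0f + 1e8f rounds back to 1e8f. */
```
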
Without the asm statements, gcc - as expected - skips the calculation of
"x" and can then simplify "a + b - b" to "a". I have previously used
inline assembly of the form:

    asm ("" : "+g" (x));

to tell gcc "You need to calculate x before running this assembly and
put it in a general register or memory, but it might change during the
assembly so you must forget anything you knew about it before". I've
found this useful for forcing a particular order of calculations, for
debugging, and as a kind of fine-tuned alternative to a memory barrier.
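A minimal sketch of that "+g" idiom applied to the test function above,
wrapped in a macro. The names FORGET_VALUE and test_barrier are made up
for illustration and do not come from any library. It uses __asm__ rather
than asm so that it should also compile in strict modes such as -std=c99:

```c
/* Hypothetical helper: an empty asm that takes v as an in/out operand in
   a general register or memory. The compiler must compute v before this
   point and must assume v may have changed afterwards. */
#define FORGET_VALUE(v) __asm__ ("" : "+g" (v))

typedef float T;

T test_barrier(T a, T b) {
    T x = a + b;
    FORGET_VALUE(x);   /* x has to be calculated here */
    return x - b;      /* uses the "new" x, so no folding to "a" */
}
```

With the barrier in place the compiler can no longer fold "a + b - b" to
"a", even with -ffast-math, because it has to assume x was modified by the
asm.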
But the "+g" constraint is not ideal for floating point variables - it
forces the compiler to move the variable from a floating point register
into a general-purpose register (or memory), then back again. The ideal
choice seems to be "+X", since the "X" constraint matches any operand
whatsoever.
However, when I use just "asm ("" : "+X" (x));", I get an error message
"error: inconsistent operand constraints in an 'asm'". I have no idea
why this is an issue.
Getting weirder, on x86-64, there is no error if I use

    asm ("" : "+X" (x) : "X" (x));

This gives me the desired effect of forcing "x" to be calculated and
used in the final "x - b".
Even weirder, on 32-bit ARM, this still gives the inconsistent operand
error.
Weirder still, this works error-free on both targets :

    asm ("" :: "X" (x));
    asm ("" : "+X" (x));

In my (non-exhaustive) testing, this gives optimal results on both
targets, independent of the compiler version and type T.
I'd imagine that the "X" constraint doesn't see much use in real inline
assembly - on x86 and ARM the assembly instruction template would
usually depend on where the data is put. But if anyone can explain this
behaviour to me, I am very curious to know what is going on.
David