On 05/10/2020 13:53, Segher Boessenkool wrote:
On Sat, Oct 03, 2020 at 09:42:40PM +0100, Harald van Dijk via Gcc-help wrote:
On 22/08/2020 16:30, Harald van Dijk wrote:
This function gets an extra "movl %eax, %eax" between the hand-written
movl and the generated ret, which can be seen online at
<https://godbolt.org/z/T8bGPo>. This extra movl is there to ensure the
high bits of %rax are zero, but the initial movl already achieves that.
How can I inform GCC that it does not need to emit that extra movl?
If your asm returns a 32-bit value, then GCC will not know what is in
the top 32 bits of the 64-bit register.
Likewise, is there an easy way to provide an inline assembly statement
with a zero-extended pointer input? This one I am able to work around:
instead of passing in a pointer value p, it is possible to pass in an
integer value (uint64_t)(uint32_t)p. But the workaround is kind of hard
to read and I would like to avoid it if possible.
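A minimal sketch of that input workaround follows. The function name
pass_pointer_zero_extended is hypothetical, and the empty asm template
only stands in for real instructions. The intermediate cast through
uintptr_t is an addition so that the snippet also compiles cleanly on
targets whose pointers are wider than 32 bits; there the upper address
bits are simply discarded, so the pattern is only meaningful on x32.

```c
#include <stdint.h>

/* Hypothetical illustration of the workaround; not code from this thread. */
static inline uint64_t pass_pointer_zero_extended(const void *p)
{
    /* Narrow to 32 bits, then widen: the upper 32 bits are zero by
       construction, which is what the asm is meant to receive. */
    uint64_t bits = (uint64_t)(uint32_t)(uintptr_t)p;

    /* Empty template standing in for real instructions. "+r" puts the
       value in a register as a read/write operand, so the compiler
       cannot assume anything about what the asm leaves there. */
    __asm__ volatile("" : "+r"(bits));

    return bits;
}
```

The operand seen by the asm is a 64-bit integer rather than a pointer,
which is exactly the readability cost complained about above.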
You can use much less chatty names for those very basic types (like u64
and u32), that makes it more readable. Hiding what you do will not make
it more readable, that is just obfuscation, so using macros is not such
a good idea. Inline asm is hard enough when you can see all there is
right in front of your eyes.
I am not trying to hide anything. I am trying to pass in a pointer value
in the same way that they are passed in all other cases. In the x32 ABI,
the way pointer values are passed is by passing them in 64-bit registers
with the high 32 bits zeroed, except in inline assembly. I'm trying to
get the inline assembly to not be a special case.
I looked through the documentation for either relevant inline assembly
constraints or relevant variable / type attributes, but was unable to
find any. The most promising search result was the mode attribute: I was
hoping it might be possible to give result a mode(DI) attribute, but the
compiler rejects that.
Constraints just say which register (or memory addressed how, or what
kind of constant). The normal way to say something should have a certain
mode is by giving it a corresponding type in C (so SImode is "int" in C,
and DImode is "long long"; "long" is either, it depends on your ABI;
"u64" and "u32" should always be clear ;-) )
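To make the type/mode correspondence above concrete, here is a small
sketch. The typedef names u32, u64, si_int and di_int are assumptions
chosen for illustration; the mode attribute is applied to plain integer
types here, not to pointers, which is the case the compiler was
reported to reject.

```c
#include <stdint.h>

/* Short names in the spirit of the suggestion above (names assumed). */
typedef uint32_t u32;
typedef uint64_t u64;

/* GCC's mode attribute applied to integer typedefs: SI is a 32-bit
   integer mode, DI a 64-bit one. */
typedef int si_int __attribute__((mode(SI)));
typedef int di_int __attribute__((mode(DI)));

/* Compile-time checks that the sizes line up as described. */
_Static_assert(sizeof(u32) == 4, "u32 should be 4 bytes");
_Static_assert(sizeof(u64) == 8, "u64 should be 8 bytes");
_Static_assert(sizeof(si_int) == sizeof(u32), "SImode should match u32");
_Static_assert(sizeof(di_int) == sizeof(u64), "DImode should match u64");
```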
In the x32 ABI, pointers do not have a single mode. They are SImode,
except when passed as parameters or returned, in which case they are
DImode (see the ix86_promote_function_mode function I mentioned). This
is really the source of the problems.
The value that I want to return is a pointer, and the assembly is
returning a pointer. There should not be any need for extra code to
convert the value returned from the assembly to the value returned from
the function, since they are already the same type, but there is,
because they are not the same mode.
I have now found that forcing a different mode appears to be exactly how
the zero-extension of arguments and return values is implemented: that
is what ix86_promote_function_mode does.
The fact that this is not an option through variable attributes or
inline assembly constraints looks like an unfortunate limitation of the
inline assembly functionality: there is currently just no way to do what
I am after. I very much hope to be proved wrong, but will try to just
pick a workaround that does not look too bad.
Is there another approach that I can use instead?
You use a 64-bit expression (or preferably even a 64-bit variable). The
same is true for outputs from the asm.
That does not work for outputs though. Without informing GCC somehow
that the high 32 bits of that 64-bit expression are zero, it will still
emit an extra extension.
One way of making the code easier to read is to actually use 64-bit
variables for all these things in the asm, and then assign them from and
to the 32-bit things.
    int x;
    char *p;
    u64 xx = x;
    u64 pp = (u64)p;
    asm("smth %0,%1" : "+r"(pp), "+r"(xx));
    p = (char *)pp;
    x = xx;
See <https://godbolt.org/z/Kx6d9q> for what happens when I modify the
example I had provided to use a 64-bit variable like you suggested: that
requires adding a cast in the return statement, and now it's that
conversion that forces the extra "movl %eax, %eax". It cannot be avoided
that way.
Segher
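For readers without access to the godbolt links, the output-side
pattern under discussion might look roughly like the sketch below. It
is a hypothetical reconstruction, not the code behind those links: the
function name get_pointer_via_u64 and the constant 4096 are made up,
and the non-x86 branch exists only so the snippet builds elsewhere.

```c
#include <stdint.h>

/* Hypothetical reconstruction; not the code behind the godbolt links. */
static void *get_pointer_via_u64(void)
{
    uint64_t result;

#if defined(__x86_64__)
    /* %k0 names the 32-bit half of the operand's register. Writing it
       zero-extends into the full 64-bit register, but nothing in the
       operand list tells the compiler that. */
    __asm__("movl $4096, %k0" : "=r"(result));
#else
    /* Fallback so the sketch compiles on other targets. */
    result = 4096;
    __asm__("" : "+r"(result));
#endif

    /* Under x32 this conversion narrows to a 32-bit pointer, which is
       then promoted again for the return value. */
    return (void *)(uintptr_t)result;
}
```

According to the message above, it is this conversion in the return
statement that brings back the extra "movl %eax, %eax" on x32, so the
64-bit variable moves the problem rather than removing it.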