On 22/08/2020 16:30, Harald van Dijk wrote:
Hi,
I am trying to port some x86-64 inline assembly to work properly for the
x32 ABI and I am running into a small problem. The x32 ABI specifies
that pointers are passed and returned zero-extended to 64 bits. When a
pointer variable is defined by an inline assembly statement and then
returned by a function, I do not see a way to inform GCC that the result
is already zero-extended and that GCC does not need to zero-extend it
again.
A silly example:
void *return_a_pointer(void) {
    void *result;
    asm("movl $0x11223344, %%eax" : "=a"(result));
    return result;
}
This function gets an extra "movl %eax, %eax" between the hand-written
movl and the generated ret, which can be seen online at
<https://godbolt.org/z/T8bGPo>. This extra movl is there to ensure the
high bits of %rax are zero, but the initial movl already achieves that.
How can I inform GCC that it does not need to emit that extra movl?
Likewise, is there an easy way to provide an inline assembly statement
with a zero-extended pointer input? This one I am able to work around:
instead of passing in a pointer value p, it is possible to pass in an
integer value (uint64_t)(uint32_t)p, but the workaround is kind of hard
to read and I would like to avoid it if possible.
I looked through the documentation for either relevant inline assembly
constraints or relevant variable / type attributes, but was unable to
find any. The most promising search result was the mode attribute: I was
hoping it might be possible to give result a mode(DI) attribute, but the
compiler rejects that.
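For comparison, this is the kind of use of the mode attribute that is
accepted, on an integer type. The typedef name u64_di and the function
widen_u32 are made up for illustration. What I was hoping for was the
equivalent on the pointer variable, which is the part that gets
rejected.

```c
/* An integer type forced to DImode (64 bits), regardless of the width
   of plain "unsigned int". The name u64_di is hypothetical. */
typedef unsigned int u64_di __attribute__((mode(DI)));

/* Hypothetical example: converting an unsigned 32-bit value to the
   DImode type zero-extends it. */
static inline u64_di widen_u32(unsigned int x) {
    return (u64_di)x;
}
```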
I have now found that forcing a different mode appears to be exactly how
the zero-extension of arguments and return values is implemented: that
is what ix86_promote_function_mode does.
The fact that this is not an option through variable attributes or
inline assembly constraints looks like an unfortunate limitation of the
inline assembly functionality: there is currently just no way to do what
I am after. I very much hope to be proved wrong, but will try to just
pick a workaround that does not look too bad.
Is there another approach that I can use instead?
Cheers,
Harald van Dijk