Re: [x86 inline asm]: width of register arguments

"Zdenek Sojka" <zsojka@xxxxxxxxx> · Tue, 02 Jul 2019 07:37:29 +0200 (CEST)

Hello Andrew,

---------- Original e-mail ----------
From: Andrew Haley <aph@xxxxxxxxxx>
To: Zdenek Sojka <zsojka@xxxxxxxxx>, gcc-help@xxxxxxxxxxx
Date: 29 June 2019 18:11:20
Subject: Re: [x86 inline asm]: width of register arguments
"Hi,

On 6/24/19 1:19 PM, Zdenek Sojka wrote:
> how does gcc choose the register arguments of an inline assembler and what
> can I assume about the "unused" bits?

The choice is made by the register allocator.

You can't assume anything about the "unused" bits. The "r" register
constraint means you get a whole register to use, of the wordsize of
the machine.
"

OK, that is very important information!

I was a bit afraid that the compiler might assume the upper bits are, e.g.,
zeroed if they were zeroed before the __asm__ statement (or that the
high-order bits might be a sign extension of the narrower value).

"

> My questions target the 64bit x86 architecture; I assume the behavior is
> the same for all target triplets x86_64-*-*
>
> 1) does gcc always use register of size matching the size of the variable?

No. "

OK, that is a shame; it seems to behave that way in my experiments:

void foo(void)
{
        uint8_t u8; uint16_t u16; uint32_t u32; uint64_t u64;
        __asm__ volatile ("# %0 %1 %2 %3" : "=r"(u8), "=r"(u16), "=r"(u32), "=r"(u64));
        __asm__ volatile ("# %0 %1 %2 %3" : "+r"(u8), "+r"(u16), "+r"(u32), "+r"(u64));
        __asm__ volatile ("# %0 %1 %2 %3" :: "r"(u8), "r"(u16), "r"(u32), "r"(u64));
}

generates, at all optimization levels (-O0 to -O3), an 8-bit register for u8,
a 16-bit register for u16, a 32-bit register for u32, and a 64-bit register
for u64:

# 9 "tsts.c" 1
        # %al %dx %ecx %rsi
# 0 "" 2
# 10 "tsts.c" 1
        # %al %dx %ecx %rsi
# 0 "" 2
# 11 "tsts.c" 1
        # %al %dx %ecx %rsi
# 0 "" 2

Similar for:

void bar(uint8_t u8, uint16_t u16, uint32_t u32, uint64_t u64)
{
        __asm__ volatile ("# %0 %1 %2 %3" : "=r"(u8), "=r"(u16), "=r"(u32), "=r"(u64));
        __asm__ volatile ("# %0 %1 %2 %3" : "+r"(u8), "+r"(u16), "+r"(u32), "+r"(u64));
        __asm__ volatile ("# %0 %1 %2 %3" :: "r"(u8), "r"(u16), "r"(u32), "r"(u64));
}

and for

void baz64(uint64_t a, uint64_t b, uint64_t c, uint64_t d)
{
        __asm__ volatile ("# %0 %1 %2 %3" :: "r"((uint8_t)a), "r"((uint16_t)b), "r"((uint32_t)c), "r"((uint64_t)d));
}

void baz8(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
        __asm__ volatile ("# %0 %1 %2 %3" :: "r"((uint8_t)a), "r"((uint16_t)b), "r"((uint32_t)c), "r"((uint64_t)d));
}

This always uses an 8-bit register for a, a 16-bit register for b, a 32-bit
register for c, and a 64-bit register for d.

Do you happen to know of any counter-example?
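
Independently of the answer, a template does not have to rely on the width
that gets printed by default: the x86 operand modifiers (%b, %w, %k, %q)
select the 8/16/32/64-bit name of whichever register was allocated. A
minimal sketch of that, assuming GCC (or a compatible compiler) on x86-64;
the helper names copy_u8 and copy_u32 are made up for illustration, and the
#else branches are only a fallback so the code builds on other targets:

```c
#include <stdint.h>

/* Hypothetical helper: copy an 8-bit value through inline asm.
   %b0 / %b1 force the byte-sized register names (e.g. %al, %dl),
   whatever width the compiler would print for a plain %0 / %1. */
static uint8_t copy_u8(uint8_t src)
{
#if defined(__GNUC__) && defined(__x86_64__)
    uint8_t dst;
    __asm__ ("movb %b1, %b0" : "=r"(dst) : "r"(src));
    return dst;
#else
    return src; /* portable fallback, not part of the technique */
#endif
}

/* Same idea for 32 bits: %k0 / %k1 force the 32-bit names (e.g. %eax). */
static uint32_t copy_u32(uint32_t src)
{
#if defined(__GNUC__) && defined(__x86_64__)
    uint32_t dst;
    __asm__ ("movl %k1, %k0" : "=r"(dst) : "r"(src));
    return dst;
#else
    return src; /* portable fallback, not part of the technique */
#endif
}
```

With the modifiers spelled out, the instruction suffix and the register
names always agree, even if some case exists where the default width
differs from the variable's type.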

"

> 2) can I assume anything about the high-order bits of the register? can I
> overwrite them freely?

No; yes.

> 2a) does gcc use the "high" 8bit registers (ah, bh, ch, dh) for variable
> allocation?

No. "

OK, thanks, another important piece of information.

"

> 2b) can gcc allocate different 8bit variables in the "low" and "high" 
> registers (eg. al/ah, bl/bh, ...)?
>
>
> For variables of type:
>
> uint8_t a8, b8;
> uint16_t a16, b16;
> ...

I think not, but I'm unsure.
"

According to 2a), ah/bh/... are not used for register allocation, so the
answer here is "No."
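
If a template itself needs one of the high byte registers, it can ask for
it explicitly: the "Q" constraint restricts the operand to a, b, c or d, and
the %h modifier prints the high-byte name of that register. A small sketch
along the lines of the xchg example in the GCC manual, assuming GCC on
x86-64; swap_bytes_u16 is a made-up name and the #else branch is only a
fallback for other targets:

```c
#include <stdint.h>

/* Hypothetical helper: swap the two bytes of a 16-bit value.
   "+Q" keeps v in a/b/c/d, so both %h0 (e.g. %ah) and %b0 (e.g. %al)
   are valid names for the two halves of the same register. */
static uint16_t swap_bytes_u16(uint16_t v)
{
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__ ("xchg %h0, %b0" : "+Q"(v));
    return v;
#else
    return (uint16_t)((v << 8) | (v >> 8)); /* portable fallback */
#endif
}
```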

"
> Enforcing same-sized arguments:
> a)
> __asm__ ("movb %b1, %b0" : "=r"(a8) : "r"(b8));
> or
> __asm__ ("movq %q1, %q0" : "=r"(a8) : "r"(b8));
> is always safe to do? (eg. moving 56bits of garbage won't hurt anything)
> OR might gcc assume something about the high-order 56bits (eg. zero,
> sign-/zero-extension of the lower 8 bits), which might get broken by the
> move?

If you ask for a register with the "r" constraint, all of that
register is yours to use.

> Assuming zero-extension:
> __asm__ ("movw %w1, %w0" : "=r"(a16) : "r"((uint8_t)b16));
> or
> __asm__ ("movw %w1, %w0" : "=r"(a16) : "r"(b8));
> does not seem to work (high-order 8 bits of a16 are garbage)

That's how x86 works.
"

Garbage in, garbage out. The high-order bits are undefined.
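
So the extension has to be made explicit. Two ways that should work, as a
sketch assuming GCC on x86-64 (the function names are made up, and the #else
branches are only a fallback for other targets): either zero-extend inside
the template with movzbw, or widen the value in C so that the operand
handed to the asm is already a full 16-bit value.

```c
#include <stdint.h>

/* Variant 1: extend inside the template.
   movzbw reads only the low byte (%b1) and writes a zero-extended
   16-bit result (%w0), so the upper bits of the input register
   do not matter. */
static uint16_t widen_in_asm(uint8_t b8)
{
#if defined(__GNUC__) && defined(__x86_64__)
    uint16_t a16;
    __asm__ ("movzbw %b1, %w0" : "=r"(a16) : "r"(b8));
    return a16;
#else
    return (uint16_t)b8; /* portable fallback */
#endif
}

/* Variant 2: extend in C.
   The cast makes the input operand a 16-bit value, so the compiler
   must materialize the zero-extended b8 in the low 16 bits before
   the asm runs; a plain 16-bit move is then enough. */
static uint16_t widen_in_c(uint8_t b8)
{
#if defined(__GNUC__) && defined(__x86_64__)
    uint16_t a16;
    __asm__ ("movw %w1, %w0" : "=r"(a16) : "r"((uint16_t)b8));
    return a16;
#else
    return (uint16_t)b8; /* portable fallback */
#endif
}
```

In both variants only the bits covered by the operand's own type are relied
upon; anything above them is still treated as unknown.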

"
--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 "

Best regards,
Zdenek Sojka