Hello Andrew,

---------- Original e-mail ----------
From: Andrew Haley <aph@xxxxxxxxxx>
To: Zdenek Sojka <zsojka@xxxxxxxxx>, gcc-help@xxxxxxxxxxx
Date: 29. 6. 2019 18:11:20
Subject: Re: [x86 inline asm]: width of register arguments

"Hi,

On 6/24/19 1:19 PM, Zdenek Sojka wrote:
> how does gcc choose the register arguments of an inline assembler and what
> can I assume about the "unused" bits?

The choice is made by the register allocator. You can't assume anything
about the "unused" bits. The "r" register constraint means you get a whole
register to use, of the wordsize of the machine.
"

OK, that's very important information! I was a bit afraid that the compiler
might assume the upper bits are e.g. zeroed if they were zeroed before the
__asm__ statement (or that the high-order bits might be a sign extension of
the narrower value).

"
> My questions target the 64bit x86 architecture; I assume the behavior is the
> same for all target triplets x86_64-*-*
>
> 1) does gcc always use register of size matching the size of the variable?

No.
"

OK, shame. It does seem to behave that way in my experiments:

#include <stdint.h>

void foo(void)
{
    uint8_t u8;
    uint16_t u16;
    uint32_t u32;
    uint64_t u64;
    __asm__ volatile ("# %0 %1 %2 %3" : "=r"(u8), "=r"(u16), "=r"(u32), "=r"(u64));
    __asm__ volatile ("# %0 %1 %2 %3" : "+r"(u8), "+r"(u16), "+r"(u32), "+r"(u64));
    __asm__ volatile ("# %0 %1 %2 %3" :: "r"(u8), "r"(u16), "r"(u32), "r"(u64));
}

generates, at all optimization levels (-O0 to -O3), an 8bit register for u8,
a 16bit register for u16, a 32bit register for u32 and a 64bit register for
u64:

# 9 "tsts.c" 1
        # %al %dx %ecx %rsi
# 0 "" 2
# 10 "tsts.c" 1
        # %al %dx %ecx %rsi
# 0 "" 2
# 11 "tsts.c" 1
        # %al %dx %ecx %rsi
# 0 "" 2

Similar for:

void bar(uint8_t u8, uint16_t u16, uint32_t u32, uint64_t u64)
{
    __asm__ volatile ("# %0 %1 %2 %3" : "=r"(u8), "=r"(u16), "=r"(u32), "=r"(u64));
    __asm__ volatile ("# %0 %1 %2 %3" : "+r"(u8), "+r"(u16), "+r"(u32), "+r"(u64));
    __asm__ volatile ("# %0 %1 %2 %3" :: "r"(u8), "r"(u16), "r"(u32), "r"(u64));
}

and for

void baz64(uint64_t a, uint64_t b, uint64_t c, uint64_t d)
{
    __asm__ volatile ("# %0 %1 %2 %3" :: "r"((uint8_t)a), "r"((uint16_t)b), "r"((uint32_t)c), "r"((uint64_t)d));
}

void baz8(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    __asm__ volatile ("# %0 %1 %2 %3" :: "r"((uint8_t)a), "r"((uint16_t)b), "r"((uint32_t)c), "r"((uint64_t)d));
}

These always use an 8bit register for a, a 16bit register for b, a 32bit
register for c and a 64bit register for d.

Do you happen to know of any counter-example?

"
> 2) can I assume anything about the high-order bits of the register? can I
> overwrite them freely?

No; yes.

> 2a) does gcc use the "high" 8bit registers (ah, bh, ch, dh) for variable
> allocation?

No.
"

OK, thanks, that's another important piece of information.

"
> 2b) can gcc allocate different 8bit variables in the "low" and "high"
> registers (eg. al/ah, bl/bh, ...)?
>
>
> For variables of type:
>
> uint8_t a8, b8;
> uint16_t a16, b16;
> ...

I think not, but I'm unsure.
"

According to 2a), ah/bh/... are not used for register allocation, so "No."

"
> Enforcing same-sized arguments:
> a)
> __asm__ ("movb %b1, %b0" : "=r"(a8) : "r"(b8));
> or
> __asm__ ("movq %q1, %q0" : "=r"(a8) : "r"(b8));
> is always safe to do? (eg. moving 56bits of garbage won't hurt anything)
> OR might gcc assume something about the high-order 56bits (eg. zero, sign- /
> zero-extension of the lower 8 bits), which might get broken by the move?

If you ask for a register with the "r" constraint, all of that register is
yours to use.

> Assuming zero-extension:
> __asm__ ("movw %w1, %w0" : "=r"(a16) : "r"((uint8_t)b16));
> or
> __asm__ ("movw %w1, %w0" : "=r"(a16) : "r"(b8));
> does not seem to work (high-order 8 bits of a16 are garbage)

That's how x86 works.
"

Garbage in, garbage out. The high order bits are undefined.

"
--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
"

Best regards,
Zdenek Sojka
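 Enforcing same-sized">

P.S. For the zero-extension case quoted above, here is a minimal sketch of two ways to avoid the garbage in the upper byte, assuming GCC (or a compatible compiler) targeting x86-64. The helper names widen_in_c and widen_in_asm are hypothetical and not from this thread. The first widens the value in C, so the compiler itself produces a fully defined 16-bit operand before the asm runs. The second keeps the 8-bit input and uses an explicitly zero-extending instruction. On other targets the sketch falls back to plain C so that it still compiles.

```c
#include <stdint.h>

/* Hypothetical helper 1: widen in C, then hand a 16-bit operand to the asm. */
static uint16_t widen_in_c(uint8_t b8)
{
    uint16_t a16;
    uint16_t wide = b8;   /* the compiler zero-extends here; all 16 bits are defined */
#if defined(__x86_64__)
    /* both operands are 16 bits wide, so movw copies only defined bits */
    __asm__ ("movw %w1, %w0" : "=r"(a16) : "r"(wide));
#else
    a16 = wide;           /* portable fallback for non-x86-64 builds */
#endif
    return a16;
}

/* Hypothetical helper 2: keep the 8-bit input and zero-extend in the asm. */
static uint16_t widen_in_asm(uint8_t b8)
{
    uint16_t a16;
#if defined(__x86_64__)
    /* movzbw reads only the low byte of operand 1 and zero-fills bits 8..15 of operand 0 */
    __asm__ ("movzbw %b1, %w0" : "=r"(a16) : "r"(b8));
#else
    a16 = b8;             /* portable fallback for non-x86-64 builds */
#endif
    return a16;
}
```

Both widen_in_c(0xAB) and widen_in_asm(0xAB) should return 0x00AB, whereas the movw variants with an 8-bit input operand copy whatever happens to be in bits 8..15 of the source register.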