Re: Register encoding in assembly for load/store instructions

Yonghong Song <yonghong.song@xxxxxxxxx> · Tue, 25 Jul 2023 15:10:46 -0700

On 7/25/23 1:09 PM, Jose E. Marchesi wrote:

On 7/25/23 11:56 AM, Jose E. Marchesi wrote:

On 7/25/23 10:29 AM, Jose E. Marchesi wrote:
Hello Yonghong.
We have noticed that the llvm disassembler uses different notations for
registers in load and store instructions, depending somehow on the width
of the data being loaded or stored.
For example, this is an excerpt from the assembler-disassembler.s test
file in llvm:
     // Note: For the group below w1 is used as a destination for sizes u8, u16, u32.
     //       This is disassembler quirk, but is technically not wrong, as there are
     //       no different encodings for 'r1 = load' vs 'w1 = load'.
     //
     // CHECK: 71 21 2a 00 00 00 00 00	w1 = *(u8 *)(r2 + 0x2a)
     // CHECK: 69 21 2a 00 00 00 00 00	w1 = *(u16 *)(r2 + 0x2a)
     // CHECK: 61 21 2a 00 00 00 00 00	w1 = *(u32 *)(r2 + 0x2a)
     // CHECK: 79 21 2a 00 00 00 00 00	r1 = *(u64 *)(r2 + 0x2a)
     r1 = *(u8*)(r2 + 42)
     r1 = *(u16*)(r2 + 42)
     r1 = *(u32*)(r2 + 42)
     r1 = *(u64*)(r2 + 42)
The comment there clarifies that the usage of wN instead of rN in the
u8, u16 and u32 cases is a "disassembler quirk".
Anyway, the problem is that it seems that `clang -S' actually emits
these forms with wN.
Is that intended?

Yes, this is intended: alu32 mode is enabled, and in that mode
w* registers are used for 8/16/32-bit loads.
So then why support 'r1 = 8948 8*9r2 + 0x2a)'?  The mode is still
alu32 mode.  Isn't the u{8,16,32} part enough to discriminate?

What does this 'r1 = 8948 8*9r2 + 0x2a)' mean?

For u8/u16/u32 loads, if objdump is given the option that indicates
alu32 mode, then the w* register is used. Without alu32 mode, the r*
register is used. It is basically the same insn; the disasm differs
depending on whether alu32 mode is on. u8/u16/u32 alone is not enough
to differentiate.
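
To make the point above concrete, here is a minimal sketch of a printer for these load instructions. It is not LLVM code: the function name `disasm_ldx` and the `alu32` parameter are hypothetical, and it only handles the plain BPF_LDX | BPF_MEM loads from the test excerpt. It shows that the same 8 bytes print with wN or rN depending only on the mode flag.

```python
import struct

# Size field (opcode bits 0x18) for BPF_LDX | BPF_MEM loads.
SIZES = {0x00: "u32", 0x08: "u16", 0x10: "u8", 0x18: "u64"}


def disasm_ldx(raw: bytes, alu32: bool) -> str:
    """Print one 8-byte BPF_LDX | BPF_MEM instruction (hypothetical helper)."""
    # Layout: opcode (u8), dst/src nibbles (u8), offset (s16), imm (s32),
    # all little-endian.
    opcode, regs, off, _imm = struct.unpack("<BBhi", raw)
    if opcode & 0x07 != 0x01 or opcode & 0xE0 != 0x60:
        raise ValueError("not a BPF_LDX | BPF_MEM instruction")
    size = SIZES[opcode & 0x18]
    dst, src = regs & 0x0F, regs >> 4
    # Only the printed name of the destination changes with the mode;
    # the encoding is identical either way.
    prefix = "w" if alu32 and size != "u64" else "r"
    return f"{prefix}{dst} = *({size} *)(r{src} + {off:#x})"


insn = bytes.fromhex("71212a0000000000")
print(disasm_ldx(insn, alu32=True))   # w1 = *(u8 *)(r2 + 0x2a)
print(disasm_ldx(insn, alu32=False))  # r1 = *(u8 *)(r2 + 0x2a)
```
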

Ok, so the llvm objdump has a switch that tells when to use rN or wN
when printing these particular instructions.  That's the "disassembler
quirk".  To what purpose?  Isn't the person passing the command line
switch the same person reading the disassembled program?  Is this "alu32
mode" more than a cosmetic thing?

But what concerns us is the assembler, not the disassembler.

clang -S (which is not objdump) seems to generate these instructions
with wN (see https://godbolt.org/z/5G433Yvrb for a store instruction for
example) and we assume the output of clang -S is intended to be passed
to an assembler, much like with gcc -S.

So, should we support both syntaxes as _input_ syntax in the assembler?

Considering that -mcpu=v3 is the recommended cpu flavor (at least on the
bpf mailing list), and that -mcpu=v3 has alu32 enabled by default, I think
gcc can start to emit insns assuming alu32 mode is on by default.
So
   w1 = *(u8 *)(r2 + 42)
is preferred.
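
As an illustration of the input-syntax question, the following is a rough sketch of an assembler front end that accepts both spellings and produces the same bytes for each. The function `assemble_ldx` and its regular expression are hypothetical and not taken from gas or llvm-mc. Rejecting wN with a u64 load is an assumption of this sketch, not something settled in this thread.

```python
import re
import struct

# Opcodes for BPF_LDX | BPF_MEM with each size, as in the CHECK lines above.
OPCODES = {"u8": 0x71, "u16": 0x69, "u32": 0x61, "u64": 0x79}

LOAD_RE = re.compile(
    r"\s*([rw])(\d+)\s*=\s*\*\(\s*(u8|u16|u32|u64)\s*\*\s*\)"
    r"\s*\(\s*r(\d+)\s*\+\s*(\w+)\s*\)\s*"
)


def assemble_ldx(line: str) -> bytes:
    """Encode one load, accepting either rN or wN as destination (sketch)."""
    m = LOAD_RE.fullmatch(line)
    if m is None:
        raise ValueError(f"unrecognized load: {line!r}")
    kind, dst, size, src, off = m.groups()
    dst, src = int(dst), int(src)
    if dst > 10 or src > 10:
        raise ValueError("BPF registers are r0..r10")
    if kind == "w" and size == "u64":
        # Assumption of this sketch: a 32-bit name with a 64-bit load is an error.
        raise ValueError("wN destination with a u64 load")
    # The r/w spelling does not reach the encoding at all.
    return struct.pack("<BBhi", OPCODES[size], (src << 4) | dst, int(off, 0), 0)


print(assemble_ldx("w1 = *(u8 *)(r2 + 42)").hex(" "))  # 71 21 2a 00 00 00 00 00
print(assemble_ldx("r1 = *(u8*)(r2 + 42)").hex(" "))   # 71 21 2a 00 00 00 00 00
```
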

Note that for the newer sign-extended loads, even in alu32 mode,
only the r* register is used, since the sign-extension extends
up to 64 bits for all variants (8/16/32).
Yes we noticed that :)
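
Putting the rules from this thread together, the choice of destination register name for a load can be sketched as below. The function `load_dest_register` and its parameters are hypothetical names used only for illustration.

```python
def load_dest_register(n: int, width_bits: int, sign_extend: bool, alu32: bool) -> str:
    """Pick the printed destination register for a load (illustrative sketch)."""
    if width_bits not in (8, 16, 32, 64):
        raise ValueError("unsupported load width")
    # wN appears only for zero-extending 8/16/32-bit loads in alu32 mode.
    # Sign-extended loads fill all 64 bits, so they always use rN.
    use_w = alu32 and not sign_extend and width_bits < 64
    return f"{'w' if use_w else 'r'}{n}"


print(load_dest_register(1, 8, sign_extend=False, alu32=True))  # w1
print(load_dest_register(1, 8, sign_extend=True, alu32=True))   # r1
```
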