>> On 7/25/23 10:29 AM, Jose E. Marchesi wrote:
>>> Hello Yonghong.
>>>
>>> We have noticed that the llvm disassembler uses different notations
>>> for registers in load and store instructions, depending somehow on
>>> the width of the data being loaded or stored.
>>>
>>> For example, this is an excerpt from the assembler-disassembler.s
>>> test file in llvm:
>>>
>>>   // Note: For the group below w1 is used as a destination for sizes u8, u16, u32.
>>>   //       This is disassembler quirk, but is technically not wrong, as there are
>>>   //       no different encodings for 'r1 = load' vs 'w1 = load'.
>>>   //
>>>   // CHECK: 71 21 2a 00 00 00 00 00 w1 = *(u8 *)(r2 + 0x2a)
>>>   // CHECK: 69 21 2a 00 00 00 00 00 w1 = *(u16 *)(r2 + 0x2a)
>>>   // CHECK: 61 21 2a 00 00 00 00 00 w1 = *(u32 *)(r2 + 0x2a)
>>>   // CHECK: 79 21 2a 00 00 00 00 00 r1 = *(u64 *)(r2 + 0x2a)
>>>   r1 = *(u8*)(r2 + 42)
>>>   r1 = *(u16*)(r2 + 42)
>>>   r1 = *(u32*)(r2 + 42)
>>>   r1 = *(u64*)(r2 + 42)
>>>
>>> The comment there clarifies that the usage of wN instead of rN in
>>> the u8, u16 and u32 cases is a "disassembler quirk".
>>>
>>> Anyway, the problem is that it seems that `clang -S' actually emits
>>> these forms with wN.
>>>
>>> Is that intended?
>>
>> Yes, this is intended since alu32 mode is enabled where
>> w* registers are used for 8/16/32 bit load.
>
> So then why supporting 'r1 = 8948 8*9r2 + 0x2a)'?  The mode is still
> alu32 mode.  Isn't the u{8,16,32} part enough to discriminate?

Sorry, my keyboard num-lock activated mid-sentence.  I meant
'r1 = (u8*)(r2 + 42)'.  Why supporting that syntax as well as
'w1 = (u8*)(r2 + 42)'?

>> Note that for newer sign-extended loads, even at alu32 mode,
>> only r* register is used since the sign-extension extends
>> up to 64 bits for all variants (8/16/32).
>
> Yes we noticed that :)
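
To make the "no different encodings" point concrete, here is a minimal
sketch in Python that decodes the four CHECK byte sequences quoted above.
It assumes the usual little-endian BPF instruction layout (opcode byte,
dst register in the low nibble and src register in the high nibble of the
second byte, signed 16-bit offset, signed 32-bit immediate).  The helper
name decode_ldx is hypothetical and is not part of llvm or any BPF
tooling.

```python
import struct

# Opcode fields for BPF load instructions (assumed from the BPF ISA layout).
BPF_LDX = 0x01   # instruction class, low 3 bits of the opcode
BPF_MEM = 0x60   # addressing mode, high 3 bits of the opcode
SIZE_NAMES = {0x00: "u32", 0x08: "u16", 0x10: "u8", 0x18: "u64"}


def decode_ldx(raw: bytes) -> dict:
    """Decode one 8-byte little-endian BPF_LDX | BPF_MEM instruction.

    Hypothetical helper for illustration only.
    """
    # B = opcode, B = packed registers, h = signed offset, i = signed imm
    opcode, regs, off, _imm = struct.unpack("<BBhi", raw)
    if opcode & 0x07 != BPF_LDX or opcode & 0xE0 != BPF_MEM:
        raise ValueError("not a BPF_LDX | BPF_MEM instruction")
    return {
        "size": SIZE_NAMES[opcode & 0x18],  # width lives in the opcode
        "dst": regs & 0x0F,                 # register number only
        "src": regs >> 4,
        "off": off,
    }


if __name__ == "__main__":
    for hexbytes in ("71212a0000000000", "69212a0000000000",
                     "61212a0000000000", "79212a0000000000"):
        print(hexbytes, decode_ldx(bytes.fromhex(hexbytes)))
```

The decoded fields carry a load width and a plain register number, and
nothing that would distinguish 'w1' from 'r1'.  Under these assumptions,
the choice between the two spellings is made by the assembler and
disassembler, not by the encoding.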