On Wed, Jul 19, 2023 at 5:53 AM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> On Tue, 2023-07-18 at 18:17 -0700, Yonghong Song wrote:
> [...]
> > > > > +static void emit_movsx_reg(u8 **pprog, int num_bits, bool is64, u32 dst_reg,
> > > > > +			   u32 src_reg)
> > > > > +{
> > > > > +	u8 *prog = *pprog;
> > > > > +
> > > > > +	if (is64) {
> > > > > +		/* movs[b,w,l]q dst, src */
> > > > > +		if (num_bits == 8)
> > > > > +			EMIT4(add_2mod(0x48, src_reg, dst_reg), 0x0f, 0xbe,
> > > > > +			      add_2reg(0xC0, src_reg, dst_reg));
> > > > > +		else if (num_bits == 16)
> > > > > +			EMIT4(add_2mod(0x48, src_reg, dst_reg), 0x0f, 0xbf,
> > > > > +			      add_2reg(0xC0, src_reg, dst_reg));
> > > > > +		else if (num_bits == 32)
> > > > > +			EMIT3(add_2mod(0x48, src_reg, dst_reg), 0x63,
> > > > > +			      add_2reg(0xC0, src_reg, dst_reg));
> > > > > +	} else {
> > > > > +		/* movs[b,w]l dst, src */
> > > > > +		if (num_bits == 8) {
> > > > > +			EMIT4(add_2mod(0x40, src_reg, dst_reg), 0x0f, 0xbe,
> > > > > +			      add_2reg(0xC0, src_reg, dst_reg));
> > >
> > > Nit: As far as I understand 4-126 Vol. 2B of [1]
> > > the 0x40 prefix (REX prefix) is optional here
> > > (same as implemented below for num_bits == 16).
> >
> > I think 0x40 prefix at least needed if register is from R8 - R15?
>
> Yes, please see below.
>
> > I use this website to do asm/disasm experiments and did
> > try various combinations with first 8 and later 8 registers
> > and it seems correct results are generated.
>
> It seems all roads lead to that web-site, I used it as well :)
> Today I learned that the following could be used:
>
> echo 'movsx rax,ax' | as -o /dev/null -aln -msyntax=intel -mnaked-reg
>
> Which opens a road to scripting experiments.

This internal tool from llvm-project may also be useful :)

llvm-mc -triple=x86_64 -show-inst -x86-asm-syntax=intel -output-asm-variant=1 <<< 'movsx rax, ax'

> > >
> > > [1] https://cdrdv2.intel.com/v1/dl/getContent/671200
> > >
> > >
> > > > > +		} else if (num_bits == 16) {
> > > > > +			if (is_ereg(dst_reg) || is_ereg(src_reg))
> > > > > +				EMIT1(add_2mod(0x40, src_reg, dst_reg));
> > > > > +			EMIT3(add_2mod(0x0f, src_reg, dst_reg), 0xbf,
> > >
> > > Nit: Basing on the same manual I don't understand why
> > > add_2mod(0x0f, src_reg, dst_reg) is used, '0xf' should suffice
> > > (but I tried it both ways and it works...).
> >
> > From the above online assembler website.
> >
> > But I will check the doc to see whether it can be simplified.
>
> I tried all combinations of r0..r9 for 64/32-bit destinations,
> 32/16/8 sources [1]:
> - 0x40 based prefix is generated if any of the following is true:
>   - dst is 64 bit
>   - dst is ereg
>   - src is ereg
>   - dst is 32-bit and src is 'sil' (part of 'rsi', used for r2)
>     (!) This one is surprising and web-site shows the same results.
>     For example `movsx eax,sil` is encoded as `40 0F BE C6`,
>     disassembling `0F BE C6` (w/o prefix) gives `movsx eax,dh`.
> - opcodes:
>   - 63     64-bit dst, 32-bit src
>   - 0F BF  64-bit dst, 16-bit src
>   - 0F BE  64-bit dst, 8-bit src
>   - 0F BF  32-bit dst, 16-bit src (same as 64-bit dst)
>   - 0F BE  32-bit dst, 8-bit src (same as 64-bit dst)
>
> Script is at [2] (it is not particularly interesting, but in case if
> you want to tweak it).
>
> [1] https://gist.github.com/eddyz87/94b35fd89f023c43dd2480e196b28ea1
> [2] https://gist.github.com/eddyz87/60991379c547df11d30fa91901862227
>
> > > > > +			      add_2reg(0xC0, src_reg, dst_reg));
> > > > > +		}
> > > > > +	}
> > > > > +
> > > > > +	*pprog = prog;
> > > > > +}
> [...]

-- 
宋方睿
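
P.S. As a rough illustration of the prefix/opcode rules summarized in the quoted text, here is a minimal standalone C sketch. It is not the kernel code: encode_movsx() is a hypothetical helper, and it takes x86-64 hardware register numbers (0 = rax, 6 = rsi, 8..15 = r8..r15) rather than BPF register numbers. The one assumption beyond the quoted list is that the 'sil' case generalizes to all of spl/bpl/sil/dil, since without a REX prefix those ModRM encodings select ah/ch/dh/bh.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): encode a register-to-register
 * "movsx dst, src" into buf, which must have room for 4 bytes.
 * dst and src are x86-64 hardware register numbers, 0..15.
 * Returns the number of bytes written, or 0 for an unsupported combination.
 */
static size_t encode_movsx(uint8_t *buf, bool dst64, int src_bits,
                           unsigned dst, unsigned src)
{
    size_t n = 0;
    uint8_t rex = 0x40;
    bool need_rex = false;

    assert(buf != NULL);
    if (dst > 15 || src > 15)
        return 0;
    /* A 32-bit source only makes sense with a 64-bit destination. */
    if (src_bits != 8 && src_bits != 16 && !(src_bits == 32 && dst64))
        return 0;

    if (dst64)    { rex |= 0x08; need_rex = true; } /* REX.W: 64-bit dst */
    if (dst >= 8) { rex |= 0x04; need_rex = true; } /* REX.R: dst is r8..r15 */
    if (src >= 8) { rex |= 0x01; need_rex = true; } /* REX.B: src is r8..r15 */
    /* spl/bpl/sil/dil: without REX, r/m 4..7 would mean ah/ch/dh/bh. */
    if (src_bits == 8 && src >= 4 && src <= 7)
        need_rex = true;

    if (need_rex)
        buf[n++] = rex;

    if (src_bits == 32) {
        buf[n++] = 0x63;                        /* movsxd r64, r/m32 */
    } else {
        buf[n++] = 0x0f;                        /* two-byte opcode escape */
        buf[n++] = src_bits == 8 ? 0xbe : 0xbf; /* movsx from r/m8 or r/m16 */
    }

    /* ModRM: mod = 11 (register direct), reg = dst, r/m = src. */
    buf[n++] = (uint8_t)(0xC0 | ((dst & 7) << 3) | (src & 7));
    return n;
}
```

With this sketch, encode_movsx(buf, false, 8, 0, 6) (i.e. movsx eax, sil) yields 40 0F BE C6, matching the bytes quoted above, and the plain 0x0f escape byte is emitted without going through any REX adjustment.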