> On 8/17/23 10:37 AM, Jose E. Marchesi wrote: >> >>> On 8/17/23 9:23 AM, Yonghong Song wrote: >>>> On 8/17/23 1:01 AM, Jose E. Marchesi wrote: >>>>> >>>>>> [...] >>>>>> In llvm, for inline asm, 0xfffffffe, 4294967294 and -2 have the same >>>>>> 4-byte bit-wise encoding, so they will be all encoded the same >>>>>> 0xfffffffe in the actual insn. >>>>>> >>>>>> The following is an example for x86 target in llvm: >>>>>> >>>>>> $ cat t.c >>>>>> int foo() { >>>>>> int a, b; >>>>>> >>>>>> asm volatile("movl $0xfffffffe, %0" : "=r"(a) :); >>>>>> asm volatile("movl $-2, %0" : "=r"(b) :); >>>>>> return a + b; >>>>>> } >>>>>> $ clang -O2 -c t.c >>>>>> $ llvm-objdump -d t.o >>>>>> >>>>>> t.o: file format elf64-x86-64 >>>>>> >>>>>> Disassembly of section .text: >>>>>> >>>>>> 0000000000000000 <foo>: >>>>>> 0: b9 fe ff ff ff movl $0xfffffffe, %ecx # >>>>>> imm = 0xFFFFFFFE >>>>>> 5: b8 fe ff ff ff movl $0xfffffffe, %eax # >>>>>> imm = 0xFFFFFFFE >>>>>> a: 01 c8 addl %ecx, %eax >>>>>> c: c3 retq >>>>>> $ >>>>>> >>>>>> Whether it is 0xfffffffe or -2, the insn encoding is the same >>>>>> and disasm prints out 0xfffffffe. >>>>> >>>>> Thanks for the explanation. >>>>> >>>>> I have pushed the commit below to binutils that makes GAS match the llvm >>>>> assembler behavior regarding constant immediates. With this patch there >>>>> are no more assembler errors when building the kernel bpf selftests. >>>> Great! Thanks. >>>> >>>>> >>>>> Note however that there is one pending divergence in the behavior of >>>>> both assemblers when facing invalid programs where immediate operands >>>>> cannot be represented in the number of bits of the field like in: >>>>> >>>>> $ cat foo.s >>>>> if r1 > r2 goto 0x3fff1 >>>>> >>>>> llvm silently truncates it to 16-bit: >>>>> >>>>> $ clang -target bpf foo.s >>>>> $ bpf-unkonwn-none-objdump -M pseudoc -dr foo.o >>>>> 0000000000000000 <.text>: >>>>> 0: 2d 21 f1 ff 00 00 00 00 if r1>r2 goto -15 >>>>> >>>>> GAS emits an error instead: >>>>> >>>>> $ as -mdialect=pseudoc foo.s >>>>> foo.s: Assembler messages: >>>>> foo.s:1: Error: pc-relative offset out of range, shall fit in 16 bits. >>>>> >>>>> (The same happens with 32-bit immediates.) >>>>> >>>>> We think the error is pertinent, and we recommend the llvm assembler to >>>>> behave the same way. >>>> Thanks! We will take a look at this issue soon. >>> >>> A patch like below can issue the warning for the above case: >>> >>> diff --git a/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp >>> b/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp >>> index 420a2aad480a..fca6bf30fb4b 100644 >>> --- a/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp >>> +++ b/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp >>> @@ -136,6 +136,12 @@ void BPFMCCodeEmitter::encodeInstruction(const >>> MCInst &MI, >>> OSE.write<uint16_t>(0); >>> OSE.write<uint32_t>(Imm >> 32); >>> } else { >>> + if (Opcode == BPF::JUGT_rr) { >>> + const MCOperand &MO = MI.getOperand(2); >>> + int64_t Imm = MO.isImm() ? MO.getImm() : 0; >>> + if (Imm > INT16_MAX || Imm < INT16_MIN) >> Shouldn't that be: >> if (Imm > UINT16_MAX || Imm < INT16_MIN) > > The number 'Imm' represents true offset (positive or negative) > as represented in .s file. > So positive offset 0xfffffffe cannot be presented. > The encoding in insn with 0xfffffffe actually means -2. Oh ok, so thats the value already encoded :) >> ? >> >>> + report_fatal_error("Branch target out of insn range"); >>> + } >>> // Get instruction encoding and emit it >>> uint64_t Value = getBinaryCodeForInstr(MI, Fixups, STI); >>> CB.push_back(Value >> 56); >>> >>> Need to generalize to other related conditional/unconditional >>> operands. Will have a formal patch for llvm soon. >>> >>> Thanks. >>