>> On 8/15/23 7:19 AM, Jose E. Marchesi wrote: >>> Hello. >>> The selftest progs/verifier_masking.c contains inline assembly code >>> like: >>> w1 = 0xffffffff; >>> The 32-bit immediate of that instruction is signed. Therefore, GAS >>> complains that the above instruction overflows its field: >>> /tmp/ccNOXFQy.s:46: Error: signed immediate out of range, shall >>> fit in 32 bits >>> The llvm assembler is likely relying on signed overflow for the >>> above to >>> work. >> >> Not really. >> >> def _ri_32 : ALU_RI<BPF_ALU, Opc, off, >> (outs GPR32:$dst), >> (ins GPR32:$src2, i32imm:$imm), >> "$dst "#OpcodeStr#" $imm", >> [(set GPR32:$dst, (OpNode GPR32:$src2, >> i32immSExt32:$imm))]>; >> >> >> If generating from source, the pattern [(set GPR32:$dst, (OpNode >> GPR32:$src2, i32immSExt32:$imm))] so value 0xffffffff is not SExt32 >> and it won't match and eventually a LDimm_64 insn will be generated. > > If by "generating from source" you mean compiling from C, then sure, I > wasn't implying clang was generating `r1 = 0xffffffff' for assigning > that positive value to a register. > >> But for inline asm, we will have >> (outs GPR32:$dst) >> (ins GPR32:$src2, i32imm:$imm) >> >> and i32imm is defined as >> def i32imm : Operand<i32>; >> which is a unsigned 32bit value, so it is recognized properly >> and the insn is encoded properly. > > We thought the imm32 operand in ALU instructions is signed, not > unsigned. Is it really unsigned?? I am going through all the BPF instructions that get 32-bit, 16-bit and 64-bit immediates, because it seems to me that we may need to distinguish between two different levels: - Value encoded in the instruction immediate: interpreted as signed or as unsigned. - How the assembler interprets a written number for the corresponding instruction operand: for example, for which instructions the assemler shall accept 0xfffffffe and 4294967294 and -2 all to denote the same value, what value is it (negative or positive) or shall it emit an overflow error. Will follow up with a summary that hopefully will serve to clarify this. >>> Using negative numbers to denote masks is ugly and obfuscating (for >>> non-obvious cases like -1/0xffffffff) so I suggest we introduce a >>> pseudo-op so we can do: >>> w1 = %mask(0xffffffff) >> >> I changed above >> w1 = 0xffffffff; >> to >> w1 = %mask(0xffffffff) >> and hit the following compilation failure. >> >> progs/verifier_masking.c:54:9: error: invalid % escape in inline >> assembly string >> 53 | asm volatile (" \ >> | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> 54 | w1 = %mask(0xffffffff); \ >> | ^ >> 1 error generated. >> >> Do you have documentation what is '%mask' thing? > > It doesn't exist. > > I am suggesting to add support for that pseudo-op to the BPF assemblers: > both GAS and the llvm BPF assembler. > >>> allowing the assembler to do the right thing (TM) converting and >>> checking that the mask is valid and not relying on UB. >>> Thoughts? >>>