> Hi Yonghong.
> Thanks for the proposal!
>
>> SDIV/SMOD (signed div and mod)
>> ==============================
>>
>> bpf already has unsigned DIV and MOD. They are encoded as
>>
>>   insn    code(4 bits)    source(1 bit)    instruction class(3 bit)    off(16 bits)
>>   DIV     0x3             0/1              BPF_ALU/BPF_ALU64           0
>>   MOD     0x9             0/1              BPF_ALU/BPF_ALU64           0
>>
>> The current 'code' field only has two value left, 0xe and 0xf.
>> gcc used these two values (0xe and 0xf) for SDIV and SMOD.
>> But using these two values takes up all 'code' space and makes
>> future extension hard.
>>
>> Here, I propose to encode SDIV/SMOD like below:
>>
>>   insn    code(4 bits)    source(1 bit)    instruction class(3 bit)    off(16 bits)
>>   DIV     0x3             0/1              BPF_ALU/BPF_ALU64           1
>>   MOD     0x9             0/1              BPF_ALU/BPF_ALU64           1
>>
>> Basically, we reuse the same 'code' value but changing 'off' from 0 to 1
>> to indicate signed div/mod.
>
> I have a general concern about using instruction operands to encode
> opcodes (in this case, 'off').
>
> At the moment we have two BPF instruction formats:
>
> - The 64-bit instructions:
>
>     code:8 regs:8 offset:16 imm:32
>
> - The 128-bit instructions:
>
>     code:8 regs:8 offset:16 imm:32 unused:32 imm:32
>
> Of these, `code', `regs' and `unused' are what is commonly known as
> instruction fields.  These are typically used for register numbers,
> flags, and opcodes.
>
> On the other hand, offset, imm32 and imm:32:::imm:32 are instruction
> operands (the later is non-contiguous and conforms the 64-bit operand in
> the 128-bit instruction).
>
> The main difference between these is that the bytes conforming
> instruction operands are themselves impacted by endianness, on top on
> the endianness effect on the whole instruction.  (The weird endian-flip
> in the two nibbles of `regs' is unfortunate, but I guess there is
> nothing we can do about it at this point and I count them as
> non-operands.)
>
> If you use an instruction operand (such as `offset') in order to act as
> an opcode, you incur in two inconveniences:
>
> 1) In effect you have "moving" opcodes that depend on the endianness.
>    The opcode for signed-operation will be 0x1 in big-endian BPF, but
>    0x8000 in little-endian bpf.
>
> 2) You lose the ability of easily adding more complementary opcodes in
>    these 16 bits in the future, in case you ever need them.
>
> As far as I have seen in other architectures, the usual way of doing
> this is to add an additional instruction format, in this case for the
> class of arithmetic instructions, where the bits dedicated to the unused
> operand (offset) becomes a new opcodes field:
>
> - 32-bit arithmetic instructions:
>
>     code:8 regs:8 code2:16 imm:32
>
> Where code2 is now an additional field (not an operand) that provides
> extra additional opcode space for this particular class of instructions.
> This can be divided in a 1-bit field to signify "signed" and the rest
> reserved for future use:
>
>     opcode2 ::= unused(15) signed(1)

Actually, this would apply only to the DIV/MOD instructions, so the new
format should be limited to them.  It would be something like:

- 64-bit ALU/ALU64 div/mod instructions (code=3,9):

    code:8 regs:8 unused:15 signed:1 imm:32

And for the rest of the ALU and ALU64 instructions
(code=0,1,2,4,5,6,7,8,a,b,c,d):

- 64-bit ALU/ALU64 instructions:

    code:8 regs:8 unused:16 imm:32
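
To make the operand-versus-field distinction concrete, here is a small C
sketch.  It is only an illustration under assumptions: the helper names
are made up (they do not come from the kernel, LLVM or GCC), and it
assumes the `signed' bit is the last bit of the 16-bit field, in the
order the format above is written.  It stores `off = 1' the way an
operand is stored on each endianness, and then stores the same flag as a
fixed field.  (A plain byte swap of 0x0001 reads back as 0x0100 rather
than the 0x8000 quoted above, but the point about the opcode moving with
endianness is the same.)

```c
#include <assert.h>
#include <stdint.h>

/* All names below are hypothetical, for illustration only. */

/* Opcode byte of an ALU/ALU64 instruction: code(4) | source(1) | class(3). */
static uint8_t alu_opcode(uint8_t code, int use_reg, uint8_t insn_class)
{
    assert(code <= 0xf && insn_class <= 0x7);
    return (uint8_t)((code << 4) | ((use_reg ? 1 : 0) << 3) | insn_class);
}

/* Store a 16-bit *operand* as a target of the given endianness would. */
static void store_operand16(uint8_t out[2], uint16_t value, int big_endian)
{
    if (big_endian) {
        out[0] = (uint8_t)(value >> 8);
        out[1] = (uint8_t)(value & 0xff);
    } else {
        out[0] = (uint8_t)(value & 0xff);
        out[1] = (uint8_t)(value >> 8);
    }
}

/*
 * Store the 16 bits as a *field* (unused:15 signed:1).  The bit position
 * is fixed in the instruction, independent of endianness.  Assumption:
 * `signed' is the least significant bit of the second byte.
 */
static void store_signed_field(uint8_t out[2], int is_signed)
{
    out[0] = 0x00;
    out[1] = is_signed ? 0x01 : 0x00;
}

/* Read two bytes as a fixed bit pattern, first byte most significant,
 * which is how a decoder matching an opcode field would see them. */
static uint16_t raw_bits16(const uint8_t in[2])
{
    return (uint16_t)((in[0] << 8) | in[1]);
}
```

With these helpers, `store_operand16(b, 1, 1)' leaves the bytes 00 01
while `store_operand16(b, 1, 0)' leaves 01 00, so a decoder has to know
the endianness before it can tell DIV from SDIV.  `store_signed_field'
always leaves 00 01.  The opcode byte itself is unchanged by the
proposal: `alu_opcode(0x3, 0, 0x7)' gives 0x37 (the ALU64 DIV with an
immediate source) for both the signed and unsigned variants.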