In the past, there have been some discussions about extending the bpf
instruction ISA to accommodate new use cases or fix some potential
issues. These new instructions will be included in a new cpu flavor,
-mcpu=v4. The following is a proposal to add new instructions in 6
different categories. The proposal is still a little rough. You can
find background information on bpf insn encoding in
Documentation/bpf/instruction-set.rst.

Compared to the previous proposal (v1) in
  https://lore.kernel.org/bpf/01515302-c37d-2ee5-c950-2f556a4caad0@xxxxxxxx/
there are two changes:
  . for sign extend load, the alu32_mode differentiator is removed,
    since alu32_mode is only a compiler asm syntax mechanism in this
    case and is not involved in the insn encoding.
  . for sign extend mov, there is no support for sign extending an
    imm to a register.

The corresponding llvm implementation is at
  https://reviews.llvm.org/D144829

The proposal details follow.

SDIV/SMOD (signed div and mod)
==============================

bpf already has unsigned DIV and MOD. They are encoded as

  insn  code(4 bits)  source(1 bit)  instruction class(3 bits)  off(16 bits)
  DIV   0x3           0/1            BPF_ALU/BPF_ALU64          0
  MOD   0x9           0/1            BPF_ALU/BPF_ALU64          0

The 'code' field only has two values left, 0xe and 0xf. gcc used
these two values (0xe and 0xf) for SDIV and SMOD, but using them
takes up all the remaining 'code' space and makes future extension
hard. Here, I propose to encode SDIV/SMOD as below:

  insn  code(4 bits)  source(1 bit)  instruction class(3 bits)  off(16 bits)
  SDIV  0x3           0/1            BPF_ALU/BPF_ALU64          1
  SMOD  0x9           0/1            BPF_ALU/BPF_ALU64          1

Basically, we reuse the same 'code' values but change 'off' from 0 to
1 to indicate signed div/mod.

Sign extend load
================

Currently, llvm-generated normal load instructions are encoded as below:

  mode(3 bits)    size(2 bits)  instruction class(3 bits)
  BPF_MEM (0x3)   8/16/32/64    BPF_LDX

For mode, the values in use are 0x0, 0x1, 0x2, 0x3 and 0x6. The
proposal is to use mode value 0x4 to encode sign extend loads:

  mode(3 bits)    size(2 bits)  instruction class(3 bits)
  BPF_SMEM (0x4)  8/16/32       BPF_LDX

Sign extend register mov
========================

The current BPF_MOV insn is encoded as

  insn  code(4 bits)  source(1 bit)  instruction class(3 bits)  off(16 bits)
  MOV   0xb           0/1            BPF_ALU/BPF_ALU64          0

Let us support a sign extended move insn as defined below:

  insn  code(4 bits)  source(1 bit)  instruction class(3 bits)  off(16 bits)
  MOVS  0xb           1              BPF_ALU                    8/16
  MOVS  0xb           1              BPF_ALU64                  8/16/32

In the sign extended mov instruction, 'off' represents the source
'size'. For example, with the BPF_ALU class and 'off' equal to 8, an
8-bit value (in a register) is sign extended to a 32-bit value. With
the BPF_ALU64 class, the same 8-bit value is sign extended to a
64-bit value.
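To make the encodings above concrete, here is a minimal standalone C
sketch that builds the three proposed insns by hand. The struct
bpf_insn layout and the pre-existing constants mirror
include/uapi/linux/bpf.h (redefined locally so the sketch compiles on
its own); BPF_SMEM and the use of 'off' to carry signedness (SDIV)
and sign-extension size (MOVS) are only this proposal's conventions,
not (yet) part of any UAPI header.

/* Standalone sketch; constants mirror linux/bpf.h, BPF_SMEM is proposed. */
#include <stdint.h>
#include <stdio.h>

struct bpf_insn {
        uint8_t code;           /* opcode */
        uint8_t dst_reg:4;      /* dest register */
        uint8_t src_reg:4;      /* source register */
        int16_t off;            /* signed offset */
        int32_t imm;            /* signed immediate constant */
};

#define BPF_ALU64 0x07          /* instruction class, bits 0-2 */
#define BPF_LDX   0x01
#define BPF_X     0x08          /* source bit: operand is a register */
#define BPF_DIV   0x30          /* ALU 'code' 0x3 in bits 4-7 */
#define BPF_MOV   0xb0          /* ALU 'code' 0xb in bits 4-7 */
#define BPF_B     0x10          /* load size: byte, bits 3-4 */
#define BPF_SMEM  0x80          /* proposed mode 0x4 in bits 5-7 */

int main(void)
{
        /* signed r1 /= r2: unchanged DIV opcode, off = 1 selects signed */
        struct bpf_insn sdiv = {
                .code = BPF_ALU64 | BPF_X | BPF_DIV,
                .dst_reg = 1, .src_reg = 2, .off = 1,
        };
        /* r1 = *(s8 *)(r2 + 0): sign extend load via the new mode */
        struct bpf_insn sload = {
                .code = BPF_LDX | BPF_B | BPF_SMEM,
                .dst_reg = 1, .src_reg = 2, .off = 0,
        };
        /* r1 = (s8)r2, sign extended to 64 bits: MOVS with off = 8 */
        struct bpf_insn movs = {
                .code = BPF_ALU64 | BPF_X | BPF_MOV,
                .dst_reg = 1, .src_reg = 2, .off = 8,
        };
        printf("sdiv  opcode=0x%02x off=%d\n", sdiv.code, sdiv.off);
        printf("sload opcode=0x%02x\n", sload.code);
        printf("movs  opcode=0x%02x off=%d\n", movs.code, movs.off);
        return 0;
}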
32-bit JA
=========

Currently, the whole range of operations with BPF_JMP32/BPF_JMP insns
is implemented as below:

  ========  =====  =========================  ============
  code      value  description                notes
  ========  =====  =========================  ============
  BPF_JA    0x00   PC += off                  BPF_JMP only
  BPF_JEQ   0x10   PC += off if dst == src
  BPF_JGT   0x20   PC += off if dst > src     unsigned
  BPF_JGE   0x30   PC += off if dst >= src    unsigned
  BPF_JSET  0x40   PC += off if dst & src
  BPF_JNE   0x50   PC += off if dst != src
  BPF_JSGT  0x60   PC += off if dst > src     signed
  BPF_JSGE  0x70   PC += off if dst >= src    signed
  BPF_CALL  0x80   function call
  BPF_EXIT  0x90   function / program return  BPF_JMP only
  BPF_JLT   0xa0   PC += off if dst < src     unsigned
  BPF_JLE   0xb0   PC += off if dst <= src    unsigned
  BPF_JSLT  0xc0   PC += off if dst < src     signed
  BPF_JSLE  0xd0   PC += off if dst <= src    signed
  ========  =====  =========================  ============

Here, 'off' is 16 bits, so the jump range is [-32768, 32767]. In rare
cases, people may have large programs or fully unrolled loops that
push some jump offsets beyond this range. With the current llvm
implementation, older llvm generates wrong code (after truncation)
and recent llvm emits a fatal error.

To fix this issue, the following new insn is proposed:

  ========  =====  =========================  ==============
  code      value  description                notes
  ========  =====  =========================  ==============
  BPF_JA    0x00   PC += imm                  BPF_JMP32 only
  ========  =====  =========================  ==============

This way, the jump offset range becomes [-2^31, 2^31 - 1]. Any other
jump instruction, e.g., BPF_JEQ, with a jump offset beyond [-32768,
32767] can be simulated with a short-range BPF_JEQ followed by a
BPF_JA.

bswap16/32/64
=============

Currently, llvm does not generate bswap16/32/64 directly. Rather, it
generates be16/32/64 or le16/32/64 instructions based on the
endianness of the bpf target of the compilation. The existing
encoding looks as below:

  bpf target     insn  code  source  insn_class  imm
  big endian     LE    0xd   LE(0)   BPF_ALU     16/32/64
  little endian  BE    0xd   BE(1)   BPF_ALU     16/32/64

An LE insn does the swap if the running target is big endian, and a
BE insn does the swap if the running target is little endian. See
kernel/bpf/core.c for details.

The new bswap instruction will have the following encoding:

  insn   code  source  insn_class  imm
  BSWAP  0xd   0       BPF_ALU64   16/32/64

The BSWAP insn does the swap unconditionally.

ST
==

The kernel already supports the BPF_ST insn, encoded as below:

  mode(3 bits)   size(2 bits)  instruction class(3 bits)
  BPF_MEM (0x3)  8/16/32/64    BPF_ST

The semantics is:

  *(size *) (dst_reg + off) = imm32

LLVM just needs to implement this instruction under -mcpu=v4. It
looks like gcc can already generate this instruction.
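As with the earlier categories, a small C sketch of these three
encodings may help. The struct layout and constants again mirror
include/uapi/linux/bpf.h (BPF_END 0xd0 is the existing be/le 'code'
value); the BPF_JMP32 JA form and the unconditional BPF_ALU64 byte
swap form are what this proposal adds, so treat them as illustrative
rather than final.

/* Standalone sketch; JMP32 JA and ALU64 BSWAP are proposed encodings. */
#include <stdint.h>
#include <stdio.h>

struct bpf_insn {
        uint8_t code;           /* opcode */
        uint8_t dst_reg:4;      /* dest register */
        uint8_t src_reg:4;      /* source register */
        int16_t off;            /* signed offset */
        int32_t imm;            /* signed immediate constant */
};

#define BPF_JMP32 0x06          /* instruction classes, bits 0-2 */
#define BPF_ALU64 0x07
#define BPF_ST    0x02
#define BPF_JA    0x00          /* jump 'code' 0x0 in bits 4-7 */
#define BPF_END   0xd0          /* byte swap 'code' 0xd in bits 4-7 */
#define BPF_W     0x00          /* store size: 32 bits */
#define BPF_MEM   0x60          /* mode 0x3 in bits 5-7 */

int main(void)
{
        /* unconditional jump with a 32-bit offset carried in imm */
        struct bpf_insn ja32 = {
                .code = BPF_JMP32 | BPF_JA,
                .imm = 100000,
        };
        /* r1 = bswap32 r1: swaps unconditionally, width in imm */
        struct bpf_insn bswap = {
                .code = BPF_ALU64 | BPF_END,
                .dst_reg = 1, .imm = 32,
        };
        /* *(u32 *)(r10 - 8) = 42: store-immediate, already in the kernel */
        struct bpf_insn st = {
                .code = BPF_ST | BPF_W | BPF_MEM,
                .dst_reg = 10, .off = -8, .imm = 42,
        };
        printf("ja32  opcode=0x%02x imm=%d\n", ja32.code, ja32.imm);
        printf("bswap opcode=0x%02x imm=%d\n", bswap.code, bswap.imm);
        printf("st    opcode=0x%02x off=%d imm=%d\n", st.code, st.off, st.imm);
        return 0;
}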