Over the past while, there have been discussions about extending the bpf
instruction ISA to accommodate new use cases and fix some potential
issues. These new instructions will be included in a new cpu flavor,
-mcpu=v4. The following is a proposal to add new instructions in 6
different categories. The proposal is a little bit rough; you can find
bpf insn background information in
Documentation/bpf/instruction-set.rst.
Your comments and suggestions are welcome!
SDIV/SMOD (signed div and mod)
==============================
bpf already has unsigned DIV and MOD. They are encoded as:

  insn  code(4 bits)  source(1 bit)  instruction class(3 bits)  off(16 bits)
  DIV   0x3           0/1            BPF_ALU/BPF_ALU64          0
  MOD   0x9           0/1            BPF_ALU/BPF_ALU64          0
The 'code' field only has two values left, 0xe and 0xf. gcc used these
two values for SDIV and SMOD, but using them takes up all the remaining
'code' space and makes future extension hard.
Here, I propose to encode SDIV/SMOD like below:

  insn  code(4 bits)  source(1 bit)  instruction class(3 bits)  off(16 bits)
  SDIV  0x3           0/1            BPF_ALU/BPF_ALU64          1
  SMOD  0x9           0/1            BPF_ALU/BPF_ALU64          1

Basically, we reuse the same 'code' values but change 'off' from 0 to 1
to indicate signed div/mod.
Sign extend load
================
Currently, llvm-generated normal load instructions are encoded like below:

  mode(3 bits)    size(2 bits)  instruction class(3 bits)
  BPF_MEM (0x3)   8/16/32/64    BPF_LDX

For 'mode', the existing used values are 0x0, 0x1, 0x2, 0x3 and 0x6.
The proposal is to use mode value 0x4 to encode sign extend loads:

  alu32_mode  mode(3 bits)    size(2 bits)  instruction class(3 bits)
  yes         BPF_SMEM (0x4)  8/16          BPF_LDX
  no          BPF_SMEM (0x4)  8/16/32       BPF_LDX
Sign extend register mov
========================
The current BPF_MOV insn is encoded as:

  insn  code(4 bits)  source(1 bit)  instruction class(3 bits)  off(16 bits)
  MOV   0xb           0/1            BPF_ALU/BPF_ALU64          0
Let us support sign extended move insn as defined below:
  alu32_mode  insn  code(4 bits)  source(1 bit)  instruction class(3 bits)  off(16 bits)
  yes         MOVS  0xb           0/1            BPF_ALU                    8/16
  no          MOVS  0xb           0/1            BPF_ALU64                  8/16/32
In the above sign extended mov instruction, 'off' represents the 'size'.
For example, if alu32 mode is enabled and 'off' is 8, an 8-bit value
(imm or register) is sign extended to a 32-bit value. If alu32 mode is
not enabled, the same 8-bit value is sign extended to a 64-bit value.
32-bit JA
=========
Currently, the whole range of BPF_JMP32/BPF_JMP operations is
implemented like below:
======== ===== ========================= ============
code value description notes
======== ===== ========================= ============
BPF_JA 0x00 PC += off BPF_JMP only
BPF_JEQ 0x10 PC += off if dst == src
BPF_JGT 0x20 PC += off if dst > src unsigned
BPF_JGE 0x30 PC += off if dst >= src unsigned
BPF_JSET 0x40 PC += off if dst & src
BPF_JNE 0x50 PC += off if dst != src
BPF_JSGT 0x60 PC += off if dst > src signed
BPF_JSGE 0x70 PC += off if dst >= src signed
BPF_CALL 0x80 function call
BPF_EXIT 0x90 function / program return BPF_JMP only
BPF_JLT 0xa0 PC += off if dst < src unsigned
BPF_JLE 0xb0 PC += off if dst <= src unsigned
BPF_JSLT 0xc0 PC += off if dst < src signed
BPF_JSLE 0xd0 PC += off if dst <= src signed
======== ===== ========================= ============
Here the 'off' is 16 bits, so the jump range is [-32768, 32767].
In rare cases, people may have large programs or fully unrolled loops,
which can push some jump offsets beyond the above range. In the current
llvm implementation, wrong code (with truncated offsets) will be
generated.
To fix this issue, the following new insn is proposed:

======== ===== ========================= ========================
code     value description               notes
======== ===== ========================= ========================
BPF_JA   0x00  PC += imm                 BPF_JMP32 only, src = 1
======== ===== ========================= ========================

This way, the jump offset range becomes [-2^31, 2^31 - 1].
For other jump instructions, e.g., BPF_JEQ, a jump offset beyond
[-32768, 32767] can be simulated with a 'BPF_JA (PC += imm)' followed
by the original BPF_JEQ with an in-range 'off', or a BPF_JEQ with a
short-range 'off' followed by a BPF_JA.
bswap16/32/64
=============
Currently, llvm does not generate bswap16/32/64 properly. Rather, it
generates be16/32/64 or le16/32/64 instructions based on the endianness
of the bpf target during compilation. The existing encoding looks like
below:

  bpf target     insn  code(4 bits)  source(1 bit)  instruction class(3 bits)  imm(32 bits)
  big endian     LE    0xd           LE(0)          BPF_ALU                    16/32/64
  little endian  BE    0xd           BE(1)          BPF_ALU                    16/32/64
LE insn will do swap if the running target is big endian.
BE insn will do swap if the running target is little endian.
See kernel/bpf/core.c for details.
The new bswap instruction will have the following encoding:

  insn   code(4 bits)  source(1 bit)  instruction class(3 bits)  imm(32 bits)
  BSWAP  0xd           0              BPF_ALU64                  16/32/64
The BSWAP insn will swap unconditionally.
ST
==
The kernel already supports the BPF_ST insn, encoded like below:

  mode(3 bits)   size(2 bits)  instruction class(3 bits)
  BPF_MEM (0x3)  8/16/32/64    BPF_ST
The semantics is:
*(size *) (dst_reg + off) = imm32
LLVM just needs to implement this instruction under -mcpu=v4. It looks
like gcc can already generate this instruction.