On Thu, Jul 06, 2023 at 04:05:37PM +0000, Dave Thaler wrote: > From: Dave Thaler <dthaler@xxxxxxxxxxxxx> > > Signed-off-by: Dave Thaler <dthaler@xxxxxxxxxxxxx> > -- > V1 -> V2: addressed comments from Alexei > --- > Documentation/bpf/instruction-set.rst | 59 ++++++++++++++++++++------- > Documentation/bpf/linux-notes.rst | 5 +++ > 2 files changed, 50 insertions(+), 14 deletions(-) > > diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst > index 751e657973f..740989f4c1e 100644 > --- a/Documentation/bpf/instruction-set.rst > +++ b/Documentation/bpf/instruction-set.rst > @@ -7,6 +7,9 @@ eBPF Instruction Set Specification, v1.0 > > This document specifies version 1.0 of the eBPF instruction set. > > +The eBPF instruction set consists of eleven 64 bit registers, a program counter, > +and an implementation-specific amount (e.g., 512 bytes) of stack space. > + > Documentation conventions > ========================= > > @@ -27,12 +30,24 @@ The eBPF calling convention is defined as: > * R6 - R9: callee saved registers that function calls will preserve > * R10: read-only frame pointer to access stack > > -R0 - R5 are scratch registers and eBPF programs needs to spill/fill them if > -necessary across calls. > +Registers R0 - R5 are caller-saved registers, meaning the BPF program needs to either > +spill them to the BPF stack or move them to callee saved registers if these > +arguments are to be reused across multiple function calls. Spilling means > +that the value in the register is moved to the BPF stack. The reverse operation > +of moving the variable from the BPF stack to the register is called filling. > +The reason for spilling/filling is due to the limited number of registers. imo this extended explanation goes too far. It's also not entirely correct. We could have an ISA with limited number of registers where every register is callee saved. A bit absurd, but possible. Or went with SPARC style register windows. > + > +Upon entering execution of an eBPF program, registers R1 - R5 initially can contain > +the input arguments for the program (similar to the argc/argv pair for a typical C program). argc/argv is only for main(). We don't have main() concept in BPF ISA. argc/argv is also not a property of ISA. > +The actual number of registers used, and their meaning, is defined by the program type; > +for example, a networking program might have an argument that includes network packet data > +and/or metadata. that makes things even more confusing. tbh none of the above changes make the doc easier to read. > Instruction encoding > ==================== > > +An eBPF program is a sequence of instructions. Kinda true, but it opens the door for plenty of bike shedding. Is it contiguous sequence? what about subprograms? Is BPF program a one function or multiple functions? etc. Just not worth it. This is ISA doc. > + > eBPF has two instruction encodings: > > * the basic instruction encoding, which uses 64 bits to encode an instruction > @@ -74,7 +89,7 @@ For example:: > 07 1 0 00 00 11 22 33 44 r1 += 0x11223344 // big > > Note that most instructions do not use all of the fields. > -Unused fields shall be cleared to zero. > +Unused fields must be set to zero. How is this better? > > As discussed below in `64-bit immediate instructions`_, a 64-bit immediate > instruction uses a 64-bit immediate value that is constructed as follows. > @@ -103,7 +118,9 @@ instruction are reserved and shall be cleared to zero. > Instruction classes > ------------------- > > -The three LSB bits of the 'opcode' field store the instruction class: > +The encoding of the 'opcode' field varies and can be determined from > +the three least significant bits (LSB) of the 'opcode' field which holds > +the "instruction class", as follows: same question. Don't see an improvement in wording. > > ========= ===== =============================== =================================== > class value description reference > @@ -149,7 +166,8 @@ code source instruction class > Arithmetic instructions > ----------------------- > > -``BPF_ALU`` uses 32-bit wide operands while ``BPF_ALU64`` uses 64-bit wide operands for > +Instruction class ``BPF_ALU`` uses 32-bit wide operands (zeroing the upper 32 bits > +of the destination register) while ``BPF_ALU64`` uses 64-bit wide operands for The other part of the doc mentions zeroing. No need to repeat. > otherwise identical operations. > The 'code' field encodes the operation as below, where 'src' and 'dst' refer > to the values of the source and destination registers, respectively. > @@ -216,8 +234,9 @@ The byte swap instructions use an instruction class of ``BPF_ALU`` and a 4-bit > The byte swap instructions operate on the destination register > only and do not use a separate source register or immediate value. > > -The 1-bit source operand field in the opcode is used to select what byte > -order the operation convert from or to: > +Byte swap instructions use the 1-bit 'source' field in the 'opcode' field > +as follows. Instead of indicating the source operator, it is instead > +used to select what byte order the operation converts from or to: +1 to this part. > > ========= ===== ================================================= > source value description > @@ -235,16 +254,21 @@ Examples: > > dst = htole16(dst) > > +where 'htole16()' indicates converting a 16-bit value from host byte order to little-endian byte order. > + > ``BPF_ALU | BPF_TO_BE | BPF_END`` with imm = 64 means:: > > dst = htobe64(dst) > > +where 'htobe64()' indicates converting a 64-bit value from host byte order to big-endian byte order. > + > Jump instructions > ----------------- > > -``BPF_JMP32`` uses 32-bit wide operands while ``BPF_JMP`` uses 64-bit wide operands for > +Instruction class ``BPF_JMP32`` uses 32-bit wide operands while ``BPF_JMP`` uses 64-bit wide operands for > otherwise identical operations. > -The 'code' field encodes the operation as below: > + > +The 4-bit 'code' field encodes the operation as below, where PC is the program counter: > > ======== ===== === =========================================== ========================================= > code value src description notes > @@ -311,7 +335,8 @@ For load and store instructions (``BPF_LD``, ``BPF_LDX``, ``BPF_ST``, and ``BPF_ > mode size instruction class > ============ ====== ================= > > -The mode modifier is one of: > +mode > + one of: > > ============= ===== ==================================== ============= > mode modifier value description reference > @@ -323,7 +348,8 @@ The mode modifier is one of: > BPF_ATOMIC 0xc0 atomic operations `Atomic operations`_ > ============= ===== ==================================== ============= > > -The size modifier is one of: > +size > + one of: > > ============= ===== ===================== > size modifier value description > @@ -334,6 +360,9 @@ The size modifier is one of: > BPF_DW 0x18 double word (8 bytes) > ============= ===== ===================== > > +instruction class > + the instruction class (see `Instruction classes`_) > + > Regular load and store operations > --------------------------------- > > @@ -352,7 +381,7 @@ instructions that transfer data between a register and memory. > > dst = *(size *) (src + offset) > > -Where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW``. > +where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW``. > > Atomic operations > ----------------- > @@ -366,7 +395,9 @@ that use the ``BPF_ATOMIC`` mode modifier as follows: > > * ``BPF_ATOMIC | BPF_W | BPF_STX`` for 32-bit operations > * ``BPF_ATOMIC | BPF_DW | BPF_STX`` for 64-bit operations > -* 8-bit and 16-bit wide atomic operations are not supported. > + > +Note that 8-bit (``BPF_B``) and 16-bit (``BPF_H``) wide atomic operations are not currently supported, > +nor is ``BPF_ATOMIC | <size> | BPF_ST``. > > The 'imm' field is used to encode the actual atomic operation. > Simple atomic operation use a subset of the values defined to encode > @@ -390,7 +421,7 @@ BPF_XOR 0xa0 atomic xor > > *(u64 *)(dst + offset) += src > > -In addition to the simple atomic operations, there also is a modifier and > +In addition to the simple atomic operations above, there also is a modifier and > two complex atomic operations: > > =========== ================ =========================== > diff --git a/Documentation/bpf/linux-notes.rst b/Documentation/bpf/linux-notes.rst > index 508d009d3be..724579fd62d 100644 > --- a/Documentation/bpf/linux-notes.rst > +++ b/Documentation/bpf/linux-notes.rst > @@ -7,6 +7,11 @@ Linux implementation notes > > This document provides more details specific to the Linux kernel implementation of the eBPF instruction set. > > +Stack space > +====================== > + > +Linux currently supports 512 bytes of stack space. I wouldn't open this door. The verifier enforces 512 stack space for bpf prog plus all of its subprogs that it calls directly. There is no good way to describe it concisely. And such description doesn't belong in ISA doc.