On Wed, Jan 25, 2023 at 06:58:17PM +0000, dthaler1968@xxxxxxxxxxxxxx wrote:
> From: Dave Thaler <dthaler@xxxxxxxxxxxxx>
>
> Use consistent names for the same field, e.g., 'dst' vs 'dst_reg'.
> Previously a mix of terms were used for the same thing in various cases.
>
> Changes since last submission: addressed comments from Alexei and Stanislav

In the future, if sending subsequent iterations of a patch, could you
please follow the typical versioning and changelog convention described
in [0]?

[0]: https://www.kernel.org/doc/html/latest/process/submitting-patches.html

>
> Signed-off-by: Dave Thaler <dthaler@xxxxxxxxxxxxx>
> ---
>  Documentation/bpf/instruction-set.rst | 105 ++++++++++++++++++--------
>  1 file changed, 74 insertions(+), 31 deletions(-)
>
> diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
> index 2d3fe59bd26..3778c807cbb 100644
> --- a/Documentation/bpf/instruction-set.rst
> +++ b/Documentation/bpf/instruction-set.rst
> @@ -30,20 +30,59 @@ Instruction encoding
>  eBPF has two instruction encodings:
>
>  * the basic instruction encoding, which uses 64 bits to encode an instruction
> -* the wide instruction encoding, which appends a second 64-bit immediate value
> -  (imm64) after the basic instruction for a total of 128 bits.
> +* the wide instruction encoding, which appends a second 64-bit immediate (i.e.,
> +  constant) value after the basic instruction for a total of 128 bits.
>
> -The basic instruction encoding looks as follows:
> +The basic instruction encoding is as follows, where MSB and LSB mean the most significant
> +bits and least significant bits, respectively:
>
>  =============  =======  ===============  ====================  ============
>  32 bits (MSB)  16 bits  4 bits           4 bits                8 bits (LSB)
>  =============  =======  ===============  ====================  ============
> -immediate      offset   source register  destination register  opcode
> +imm            offset   src              dst                   opcode

What's the rationale for changing source register and destination
register to src and dst respectively here? Below you clarify that they
mean something other than register number after this section in the
document, so why not just leave them as is here to avoid any confusion?

>  =============  =======  ===============  ====================  ============
>
> +imm

Can we make all of these bold, just to slightly improve readability?
E.g.:

**imm**

> +  signed integer immediate value
> +
> +offset
> +  signed integer offset used with pointer arithmetic
> +
> +src
> +  the source register number (0-10), except where otherwise specified
> +  (`64-bit immediate instructions`_ reuse this field for other purposes)
> +
> +dst
> +  destination register number (0-10)
> +
> +opcode
> +  operation to perform
> +
>  Note that most instructions do not use all of the fields.
>  Unused fields shall be cleared to zero.
>
> +As discussed below in `64-bit immediate instructions`_, a 64-bit immediate
> +instruction uses a 64-bit immediate value that is constructed as follows.

FWIW, I'd consider moving this description of how imm64 is encoded into
the 64-bit immediate instructions section, as it only has relevance in
that context anyways. What do you think?

> +The 64 bits following the basic instruction contain a pseudo instruction
> +using the same format but with opcode, dst, src, and offset all set to zero,
> +and imm containing the high 32 bits of the immediate value.
> +
> +=================  ==================
> +64 bits (MSB)      64 bits (LSB)
> +=================  ==================
> +basic instruction  pseudo instruction
> +=================  ==================
> +
> +Thus the 64-bit immediate value is constructed as follows:
> +
> +  imm64 = (next_imm << 32) | imm
> +
> +where 'next_imm' refers to the imm value of the pseudo instruction
> +following the basic instruction.
> +
> +In the remainder of this document 'src' and 'dst' refer to the values of the source
> +and destination registers, respectively, rather than the register number.
> +
>  Instruction classes
>  -------------------
>
> @@ -71,20 +110,24 @@ For arithmetic and jump instructions (``BPF_ALU``, ``BPF_ALU64``, ``BPF_JMP`` an
>  ==============  ======  =================
>  4 bits (MSB)    1 bit   3 bits (LSB)
>  ==============  ======  =================
> -operation code  source  instruction class
> +code            source  instruction class
>  ==============  ======  =================
>
> -The 4th bit encodes the source operand:
> +code
> +  the operation code, whose meaning varies by instruction class
>
> -  ======  =====  ========================================
> -  source  value  description
> -  ======  =====  ========================================
> -  BPF_K   0x00   use 32-bit immediate as source operand
> -  BPF_X   0x08   use 'src_reg' register as source operand
> -  ======  =====  ========================================
> +source
> +  the source operand location, which unless otherwise specified is one of:
>
> -The four MSB bits store the operation code.
> +  ======  =====  ==========================================
> +  source  value  description
> +  ======  =====  ==========================================
> +  BPF_K   0x00   use 32-bit 'imm' value as source operand
> +  BPF_X   0x08   use 'src' register value as source operand
> +  ======  =====  ==========================================
>
> +instruction class
> +  the instruction class (see `Instruction classes`_)
>
>  Arithmetic instructions
>  -----------------------
> @@ -121,19 +164,19 @@ the destination register is unchanged whereas for ``BPF_ALU`` the upper
>
>  ``BPF_ADD | BPF_X | BPF_ALU`` means::
>
> -  dst_reg = (u32) dst_reg + (u32) src_reg;
> +  dst = (u32) ((u32) dst + (u32) src)
>
>  ``BPF_ADD | BPF_X | BPF_ALU64`` means::
>
> -  dst_reg = dst_reg + src_reg
> +  dst = dst + src
>
>  ``BPF_XOR | BPF_K | BPF_ALU`` means::
>
> -  dst_reg = (u32) dst_reg ^ (u32) imm32
> +  dst = (u32) dst ^ (u32) imm32
>
>  ``BPF_XOR | BPF_K | BPF_ALU64`` means::
>
> -  dst_reg = dst_reg ^ imm32
> +  dst = dst ^ imm32
>
>  Also note that the division and modulo operations are unsigned. Thus, for
>  ``BPF_ALU``, 'imm' is first interpreted as an unsigned 32-bit value, whereas
> @@ -167,11 +210,11 @@ Examples:
>
>  ``BPF_ALU | BPF_TO_LE | BPF_END`` with imm = 16 means::
>
> -  dst_reg = htole16(dst_reg)
> +  dst = htole16(dst)
>
>  ``BPF_ALU | BPF_TO_BE | BPF_END`` with imm = 64 means::
>
> -  dst_reg = htobe64(dst_reg)
> +  dst = htobe64(dst)
>
>  Jump instructions
>  -----------------
> @@ -246,15 +289,15 @@ instructions that transfer data between a register and memory.
>
>  ``BPF_MEM | <size> | BPF_STX`` means::
>
> -  *(size *) (dst_reg + off) = src_reg
> +  *(size *) (dst + offset) = src_reg

s/src_reg/src

>
>  ``BPF_MEM | <size> | BPF_ST`` means::
>
> -  *(size *) (dst_reg + off) = imm32
> +  *(size *) (dst + offset) = imm32
>
>  ``BPF_MEM | <size> | BPF_LDX`` means::
>
> -  dst_reg = *(size *) (src_reg + off)
> +  dst = *(size *) (src + offset)
>
>  Where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW``.
>
> @@ -288,11 +331,11 @@ BPF_XOR 0xa0 atomic xor
>
>  ``BPF_ATOMIC | BPF_W | BPF_STX`` with 'imm' = BPF_ADD means::
>
> -  *(u32 *)(dst_reg + off16) += src_reg
> +  *(u32 *)(dst + offset) += src
>
>  ``BPF_ATOMIC | BPF_DW | BPF_STX`` with 'imm' = BPF ADD means::
>
> -  *(u64 *)(dst_reg + off16) += src_reg
> +  *(u64 *)(dst + offset) += src
>
>  In addition to the simple atomic operations, there also is a modifier and
>  two complex atomic operations:
> @@ -307,16 +350,16 @@ BPF_CMPXCHG 0xf0 | BPF_FETCH atomic compare and exchange
>
>  The ``BPF_FETCH`` modifier is optional for simple atomic operations, and
>  always set for the complex atomic operations. If the ``BPF_FETCH`` flag
> -is set, then the operation also overwrites ``src_reg`` with the value that
> +is set, then the operation also overwrites ``src`` with the value that
>  was in memory before it was modified.
>
> -The ``BPF_XCHG`` operation atomically exchanges ``src_reg`` with the value
> -addressed by ``dst_reg + off``.
> +The ``BPF_XCHG`` operation atomically exchanges ``src`` with the value
> +addressed by ``dst + offset``.
>
>  The ``BPF_CMPXCHG`` operation atomically compares the value addressed by
> -``dst_reg + off`` with ``R0``. If they match, the value addressed by
> -``dst_reg + off`` is replaced with ``src_reg``. In either case, the
> -value that was at ``dst_reg + off`` before the operation is zero-extended
> +``dst + offset`` with ``R0``. If they match, the value addressed by
> +``dst + offset`` is replaced with ``src``. In either case, the
> +value that was at ``dst + offset`` before the operation is zero-extended
>  and loaded back to ``R0``.
>
>  64-bit immediate instructions
> @@ -329,7 +372,7 @@ There is currently only one such instruction.
>
>  ``BPF_LD | BPF_DW | BPF_IMM`` means::
>
> -  dst_reg = imm64
> +  dst = imm64
>
>
>  Legacy BPF Packet access instructions
> --
> 2.33.4
>
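
For what it's worth, here is a rough C sketch of the imm64 construction
described in the patch, in case it helps when deciding where that text
should live. The struct layout and the names `struct insn` and
`make_imm64` are made up for illustration and are not taken from the
patch or from any header. One thing the sketch makes visible: since
'imm' is documented as a signed 32-bit value, a literal C reading of
`(next_imm << 32) | imm` would sign-extend a negative 'imm' into the
upper half, so the sketch casts both halves to unsigned 32-bit first.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical in-memory view of one 64-bit basic instruction.
 * Field names follow the table in the patch; the layout is an assumption. */
struct insn {
	uint8_t opcode;   /* operation to perform */
	uint8_t regs;     /* src register number in the high 4 bits, dst in the low 4 */
	int16_t offset;   /* signed offset used with pointer arithmetic */
	int32_t imm;      /* signed 32-bit immediate */
};

/* Combine the imm of the basic instruction (low 32 bits) with the imm of
 * the pseudo instruction that follows it (high 32 bits). Both halves are
 * treated as unsigned so a negative low half does not sign-extend. */
static uint64_t make_imm64(const struct insn *basic, const struct insn *pseudo)
{
	assert(basic && pseudo);
	return ((uint64_t)(uint32_t)pseudo->imm << 32) | (uint32_t)basic->imm;
}
```

With a basic instruction whose imm holds 0x89abcdef and a pseudo
instruction whose imm holds 0x01234567, this yields 0x0123456789abcdef.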