On Fri, Jan 05, 2024 at 11:14:51AM +0800, Aoyang Fang wrote: Hi Aoyang, Thanks a lot for your contribution. I agree that we need to fix the document to be consistent, though I'm afraid that I think this patch goes in the wrong direction by making everything match the jump instruction class. More below. nit: Could you please update the patch subject to be more self-describing. For example, something like: Use consistent numerical widths in instructions.rst encodings > For example: > 1. 1.3.1 Arithmetic instructions use '8 bits length' encoding to > express the 'code' value, e.g., BPF_ADD=0x00, BPF_SUB=0x10, > BPF_MUL=0x20. However the length of the 'code' is 4 bits. On the > other hand, 1.3.3 Jump instructions use '4 bits length' encoding, > e.g., BPF_JEQ=0x1 and BPF_JGT=0x2. > 2. There are also many places that use '8 bits length' encoding to > express the corresponding contents, e.g., 1.4 Load and store > instructions, BPF_ABS=0x20, BPF_IND=0x40. However, the length of > 'mode modifier' is 3 bits. > > To summarize, the only place that has inconsistent encoding is Jump > instructions. After discussing with Dave, dthaler1968@xxxxxxxxxxxxxx, > we agree that the document should be more clear. > > Signed-off-by: Aoyang Fang <aoyangfang@xxxxxxxxxxxxxxxx> > > --- > .../bpf/standardization/instruction-set.rst | 170 +++++++++--------- > 1 file changed, 85 insertions(+), 85 deletions(-) > > diff --git a/Documentation/bpf/standardization/instruction-set.rst b/Documentation/bpf/standardization/instruction-set.rst > index 245b6defc..57dd1fa00 100644 > --- a/Documentation/bpf/standardization/instruction-set.rst > +++ b/Documentation/bpf/standardization/instruction-set.rst > @@ -172,18 +172,18 @@ Instruction classes > > The three LSB bits of the 'opcode' field store the instruction class: > > -========= ===== =============================== =================================== > -class value description reference > -========= ===== =============================== =================================== > -BPF_LD 0x00 non-standard load operations `Load and store instructions`_ > -BPF_LDX 0x01 load into register operations `Load and store instructions`_ > -BPF_ST 0x02 store from immediate operations `Load and store instructions`_ > -BPF_STX 0x03 store from register operations `Load and store instructions`_ > -BPF_ALU 0x04 32-bit arithmetic operations `Arithmetic and jump instructions`_ > -BPF_JMP 0x05 64-bit jump operations `Arithmetic and jump instructions`_ > -BPF_JMP32 0x06 32-bit jump operations `Arithmetic and jump instructions`_ > -BPF_ALU64 0x07 64-bit arithmetic operations `Arithmetic and jump instructions`_ > -========= ===== =============================== =================================== > +========= ============= =============================== =================================== > +class value(3 bits) description reference > +========= ============= =============================== =================================== > +BPF_LD 0x0 non-standard load operations `Load and store instructions`_ > +BPF_LDX 0x1 load into register operations `Load and store instructions`_ > +BPF_ST 0x2 store from immediate operations `Load and store instructions`_ > +BPF_STX 0x3 store from register operations `Load and store instructions`_ > +BPF_ALU 0x4 32-bit arithmetic operations `Arithmetic and jump instructions`_ > +BPF_JMP 0x5 64-bit jump operations `Arithmetic and jump instructions`_ > +BPF_JMP32 0x6 32-bit jump operations `Arithmetic and jump instructions`_ > +BPF_ALU64 0x7 64-bit arithmetic operations `Arithmetic and jump instructions`_ > +========= ============= =============================== =================================== Hmm, I presonally think this is more confusing. The opcode field is 8 bits. We already specify that the value is the three LSB of the opcode field. It's certainly subjective, but I think we should have the value reflect the actual value in the field it's embedded in. In my opinion, changing the value to not reflect its place in the actual opcode in my opinion imposes a burden on the reader to go back and reference where the field actually belongs in the full opcode. It's a tradeoff, but I think we're already on the winning end of that tradeoff. > Arithmetic and jump instructions > ================================ > @@ -203,12 +203,12 @@ code source instruction class > **source** > the source operand location, which unless otherwise specified is one of: > > - ====== ===== ============================================== > - source value description > - ====== ===== ============================================== > - BPF_K 0x00 use 32-bit 'imm' value as source operand > - BPF_X 0x08 use 'src_reg' register value as source operand > - ====== ===== ============================================== > + ====== ============ ============================================== > + source value(1 bit) description > + ====== ============ ============================================== > + BPF_K 0x0 use 32-bit 'imm' value as source operand > + BPF_X 0x1 use 'src_reg' register value as source operand > + ====== ============ ============================================== Same here as well. The value isn't really 0x1, it's 0x8. And 0x08 is even more clear yet, given that we're representing the value of the bit in the 8 bit opcode field. > **instruction class** > the instruction class (see `Instruction classes`_) > @@ -221,27 +221,27 @@ otherwise identical operations. > The 'code' field encodes the operation as below, where 'src' and 'dst' refer > to the values of the source and destination registers, respectively. > > -========= ===== ======= ========================================================== > -code value offset description > -========= ===== ======= ========================================================== > -BPF_ADD 0x00 0 dst += src > -BPF_SUB 0x10 0 dst -= src > -BPF_MUL 0x20 0 dst \*= src > -BPF_DIV 0x30 0 dst = (src != 0) ? (dst / src) : 0 > -BPF_SDIV 0x30 1 dst = (src != 0) ? (dst s/ src) : 0 > -BPF_OR 0x40 0 dst \|= src > -BPF_AND 0x50 0 dst &= src > -BPF_LSH 0x60 0 dst <<= (src & mask) > -BPF_RSH 0x70 0 dst >>= (src & mask) > -BPF_NEG 0x80 0 dst = -dst > -BPF_MOD 0x90 0 dst = (src != 0) ? (dst % src) : dst > -BPF_SMOD 0x90 1 dst = (src != 0) ? (dst s% src) : dst > -BPF_XOR 0xa0 0 dst ^= src > -BPF_MOV 0xb0 0 dst = src > -BPF_MOVSX 0xb0 8/16/32 dst = (s8,s16,s32)src > -BPF_ARSH 0xc0 0 :term:`sign extending<Sign Extend>` dst >>= (src & mask) > -BPF_END 0xd0 0 byte swap operations (see `Byte swap instructions`_ below) > > -========= ===== ======= ========================================================== > +========= ============= ======= ========================================================== > +code value(4 bits) offset description > +========= ============= ======= ========================================================== > +BPF_ADD 0x0 0 dst += src > +BPF_SUB 0x1 0 dst -= src > +BPF_MUL 0x2 0 dst \*= src > +BPF_DIV 0x3 0 dst = (src != 0) ? (dst / src) : 0 > +BPF_SDIV 0x3 1 dst = (src != 0) ? (dst s/ src) : 0 > +BPF_OR 0x4 0 dst \|= src > +BPF_AND 0x5 0 dst &= src > +BPF_LSH 0x6 0 dst <<= (src & mask) > +BPF_RSH 0x7 0 dst >>= (src & mask) > +BPF_NEG 0x8 0 dst = -dst > +BPF_MOD 0x9 0 dst = (src != 0) ? (dst % src) : dst > +BPF_SMOD 0x9 1 dst = (src != 0) ? (dst s% src) : dst > +BPF_XOR 0xa 0 dst ^= src > +BPF_MOV 0xb 0 dst = src > +BPF_MOVSX 0xb 8/16/32 dst = (s8,s16,s32)src > +BPF_ARSH 0xc 0 :term:`sign extending<Sign Extend>` dst >>= (src & mask) > +BPF_END 0xd 0 byte swap operations (see `Byte swap instructions`_ below) > +========= ============= ======= ========================================================== Same here. > Underflow and overflow are allowed during arithmetic operations, meaning > the 64-bit or 32-bit value will wrap. If BPF program execution would > @@ -314,13 +314,13 @@ select what byte order the operation converts from or to. For > ``BPF_ALU64``, the 1-bit source operand field in the opcode is reserved > and must be set to 0. > > -========= ========= ===== ================================================= > -class source value description > -========= ========= ===== ================================================= > -BPF_ALU BPF_TO_LE 0x00 convert between host byte order and little endian > -BPF_ALU BPF_TO_BE 0x08 convert between host byte order and big endian > -BPF_ALU64 Reserved 0x00 do byte swap unconditionally > -========= ========= ===== ================================================= > +========= ========= ============ ================================================= > +class source value(1 bit) description > +========= ========= ============ ================================================= > +BPF_ALU BPF_TO_LE 0x0 convert between host byte order and little endian > +BPF_ALU BPF_TO_BE 0x1 convert between host byte order and big endian > +BPF_ALU64 Reserved 0x0 do byte swap unconditionally > +========= ========= ============ ================================================= Same here. Which bit does the 0x1 actually correspond to? It's self-evident in the former, not the latter. > > The 'imm' field encodes the width of the swap operations. The following widths > are supported: 16, 32 and 64. > @@ -352,27 +352,27 @@ Jump instructions > otherwise identical operations. > The 'code' field encodes the operation as below: > > -======== ===== === =========================================== ========================================= > -code value src description notes > -======== ===== === =========================================== ========================================= > -BPF_JA 0x0 0x0 PC += offset BPF_JMP class > -BPF_JA 0x0 0x0 PC += imm BPF_JMP32 class > -BPF_JEQ 0x1 any PC += offset if dst == src > -BPF_JGT 0x2 any PC += offset if dst > src unsigned > -BPF_JGE 0x3 any PC += offset if dst >= src unsigned > -BPF_JSET 0x4 any PC += offset if dst & src > -BPF_JNE 0x5 any PC += offset if dst != src > -BPF_JSGT 0x6 any PC += offset if dst > src signed > -BPF_JSGE 0x7 any PC += offset if dst >= src signed > -BPF_CALL 0x8 0x0 call helper function by address see `Helper functions`_ > -BPF_CALL 0x8 0x1 call PC += imm see `Program-local functions`_ > -BPF_CALL 0x8 0x2 call helper function by BTF ID see `Helper functions`_ > -BPF_EXIT 0x9 0x0 return BPF_JMP only > -BPF_JLT 0xa any PC += offset if dst < src unsigned > -BPF_JLE 0xb any PC += offset if dst <= src unsigned > -BPF_JSLT 0xc any PC += offset if dst < src signed > -BPF_JSLE 0xd any PC += offset if dst <= src signed > -======== ===== === =========================================== ========================================= > +======== ============= === =========================================== ========================================= > +code value(4 bits) src description notes > +======== ============= === =========================================== ========================================= > +BPF_JA 0x0 0x0 PC += offset BPF_JMP class > +BPF_JA 0x0 0x0 PC += imm BPF_JMP32 class > +BPF_JEQ 0x1 any PC += offset if dst == src > +BPF_JGT 0x2 any PC += offset if dst > src unsigned > +BPF_JGE 0x3 any PC += offset if dst >= src unsigned > +BPF_JSET 0x4 any PC += offset if dst & src > +BPF_JNE 0x5 any PC += offset if dst != src > +BPF_JSGT 0x6 any PC += offset if dst > src signed > +BPF_JSGE 0x7 any PC += offset if dst >= src signed > +BPF_CALL 0x8 0x0 call helper function by address see `Helper functions`_ > +BPF_CALL 0x8 0x1 call PC += imm see `Program-local functions`_ > +BPF_CALL 0x8 0x2 call helper function by BTF ID see `Helper functions`_ > +BPF_EXIT 0x9 0x0 return BPF_JMP only > +BPF_JLT 0xa any PC += offset if dst < src unsigned > +BPF_JLE 0xb any PC += offset if dst <= src unsigned > +BPF_JSLT 0xc any PC += offset if dst < src signed > +BPF_JSLE 0xd any PC += offset if dst <= src signed > +======== ============= === =========================================== ========================================= Good catch, this definitely needed to be fixed. I personally would vote to update it to match the other instruction classes, rather than updating the other ones to match jump instructions. What do you think? Thanks, David
Attachment:
signature.asc
Description: PGP signature