David Vernet <void@xxxxxxxxxxxxx> writes: > Hi Aoyang, > > Thanks a lot for your contribution. I agree that we need to fix the document > to be consistent, though I'm afraid that I think this patch goes in the wrong > direction by making everything match the jump instruction class. More below. I disagree, and I agree with Aoyang's direction. > nit: Could you please update the patch subject to be more self-describing. For > example, something like: > > Use consistent numerical widths in instructions.rst encodings I agree with that subject. > > For example: > > 1. 1.3.1 Arithmetic instructions use '8 bits length' encoding to > > express the 'code' value, e.g., BPF_ADD=0x00, BPF_SUB=0x10, > > BPF_MUL=0x20. However the length of the 'code' is 4 bits. On the > > other hand, 1.3.3 Jump instructions use '4 bits length' encoding, > > e.g., BPF_JEQ=0x1 and BPF_JGT=0x2. > > 2. There are also many places that use '8 bits length' encoding to > > express the corresponding contents, e.g., 1.4 Load and store > > instructions, BPF_ABS=0x20, BPF_IND=0x40. However, the length of > > 'mode modifier' is 3 bits. > > > > To summarize, the only place that has inconsistent encoding is Jump > > instructions. After discussing with Dave, dthaler1968@xxxxxxxxxxxxxx, > > we agree that the document should be more clear. > > > > Signed-off-by: Aoyang Fang <aoyangfang@xxxxxxxxxxxxxxxx> > > > > --- > > .../bpf/standardization/instruction-set.rst | 170 +++++++++--------- > > 1 file changed, 85 insertions(+), 85 deletions(-) > > > > diff --git a/Documentation/bpf/standardization/instruction-set.rst > > b/Documentation/bpf/standardization/instruction-set.rst > > index 245b6defc..57dd1fa00 100644 > > --- a/Documentation/bpf/standardization/instruction-set.rst > > +++ b/Documentation/bpf/standardization/instruction-set.rst > > @@ -172,18 +172,18 @@ Instruction classes > > > > The three LSB bits of the 'opcode' field store the instruction class: > > > > -========= ===== =============================== > =================================== > > -class value description reference > > -========= ===== =============================== > =================================== > > -BPF_LD 0x00 non-standard load operations `Load and store > instructions`_ > > -BPF_LDX 0x01 load into register operations `Load and store > instructions`_ > > -BPF_ST 0x02 store from immediate operations `Load and store > instructions`_ > > -BPF_STX 0x03 store from register operations `Load and store > instructions`_ > > -BPF_ALU 0x04 32-bit arithmetic operations `Arithmetic and jump > instructions`_ > > -BPF_JMP 0x05 64-bit jump operations `Arithmetic and jump > instructions`_ > > -BPF_JMP32 0x06 32-bit jump operations `Arithmetic and jump > instructions`_ > > -BPF_ALU64 0x07 64-bit arithmetic operations `Arithmetic and jump > instructions`_ > > -========= ===== =============================== > > =================================== > > +========= ============= =============================== > =================================== > > +class value(3 bits) description reference > > +========= ============= =============================== > =================================== > > +BPF_LD 0x0 non-standard load operations `Load and store > instructions`_ > > +BPF_LDX 0x1 load into register operations `Load and store > instructions`_ > > +BPF_ST 0x2 store from immediate operations `Load and store > instructions`_ > > +BPF_STX 0x3 store from register operations `Load and store > instructions`_ > > +BPF_ALU 0x4 32-bit arithmetic operations `Arithmetic and jump > instructions`_ > > +BPF_JMP 0x5 64-bit jump operations `Arithmetic and jump > instructions`_ > > +BPF_JMP32 0x6 32-bit jump operations `Arithmetic and jump > instructions`_ > > +BPF_ALU64 0x7 64-bit arithmetic operations `Arithmetic and jump > instructions`_ > > +========= ============= =============================== > > +=================================== > > Hmm, I presonally think this is more confusing. The opcode field is 8 bits. We > already specify that the value is the three LSB of the opcode field. It's > certainly subjective, but I think we should have the value reflect the actual > value in the field it's embedded in. In my opinion, changing the value to not > reflect its place in the actual opcode in my opinion imposes a burden on the > reader to go back and reference where the field actually belongs in the full > opcode. It's a tradeoff, but I think we're already on the winning end of that > tradeoff. This document is an IETF standards specification so it's worth looking at what typical RFC conventions are. * RFC 791 section 3.1 defines the IPv4 header, where the Version field is the high 4 bits of a byte. It defines the value as 4, not 0x40. It also defines the Type of Service bits which are 1 bit fields with value 0 or 1 (not, say 0x40). * RFC 8200 section 3 defines the IPv6 header, where the Version field is the high 4 bits of a byte. It defines the value as 6, not 0x60. Etc. Offhand I am not aware of any RFC that uses the convention you suggest, though perhaps others are? > > Arithmetic and jump instructions > > ================================ > > @@ -203,12 +203,12 @@ code source instruction class > > **source** > > the source operand location, which unless otherwise specified is one of: > > > > - ====== ===== ============================================== > > - source value description > > - ====== ===== ============================================== > > - BPF_K 0x00 use 32-bit 'imm' value as source operand > > - BPF_X 0x08 use 'src_reg' register value as source operand > > - ====== ===== ============================================== > > + ====== ============ > > + ============================================== > > + source value(1 bit) description > > + ====== ============ > ============================================== > > + BPF_K 0x0 use 32-bit 'imm' value as source operand > > + BPF_X 0x1 use 'src_reg' register value as source operand > > + ====== ============ > > + ============================================== > > Same here as well. The value isn't really 0x1, it's 0x8. And 0x08 is even more > clear yet, given that we're representing the value of the bit in the 8 bit opcode > field. Its 1, in the same sense as the TOS bits in RFC 791 are 1. > > **instruction class** > > the instruction class (see `Instruction classes`_) @@ -221,27 > > +221,27 @@ otherwise identical operations. > > The 'code' field encodes the operation as below, where 'src' and > > 'dst' refer to the values of the source and destination registers, respectively. > > > > -========= ===== ======= > ========================================================== > > -code value offset description > > -========= ===== ======= > ========================================================== > > -BPF_ADD 0x00 0 dst += src > > -BPF_SUB 0x10 0 dst -= src > > -BPF_MUL 0x20 0 dst \*= src > > -BPF_DIV 0x30 0 dst = (src != 0) ? (dst / src) : 0 > > -BPF_SDIV 0x30 1 dst = (src != 0) ? (dst s/ src) : 0 > > -BPF_OR 0x40 0 dst \|= src > > -BPF_AND 0x50 0 dst &= src > > -BPF_LSH 0x60 0 dst <<= (src & mask) > > -BPF_RSH 0x70 0 dst >>= (src & mask) > > -BPF_NEG 0x80 0 dst = -dst > > -BPF_MOD 0x90 0 dst = (src != 0) ? (dst % src) : dst > > -BPF_SMOD 0x90 1 dst = (src != 0) ? (dst s% src) : dst > > -BPF_XOR 0xa0 0 dst ^= src > > -BPF_MOV 0xb0 0 dst = src > > -BPF_MOVSX 0xb0 8/16/32 dst = (s8,s16,s32)src > > -BPF_ARSH 0xc0 0 :term:`sign extending<Sign Extend>` dst >>= (src & > mask) > > -BPF_END 0xd0 0 byte swap operations (see `Byte swap instructions`_ > below) > > > > -========= ===== ======= > > ========================================================== > > +========= ============= ======= > ========================================================== > > +code value(4 bits) offset description > > +========= ============= ======= > ========================================================== > > +BPF_ADD 0x0 0 dst += src > > +BPF_SUB 0x1 0 dst -= src > > +BPF_MUL 0x2 0 dst \*= src > > +BPF_DIV 0x3 0 dst = (src != 0) ? (dst / src) : 0 > > +BPF_SDIV 0x3 1 dst = (src != 0) ? (dst s/ src) : 0 > > +BPF_OR 0x4 0 dst \|= src > > +BPF_AND 0x5 0 dst &= src > > +BPF_LSH 0x6 0 dst <<= (src & mask) > > +BPF_RSH 0x7 0 dst >>= (src & mask) > > +BPF_NEG 0x8 0 dst = -dst > > +BPF_MOD 0x9 0 dst = (src != 0) ? (dst % src) : dst > > +BPF_SMOD 0x9 1 dst = (src != 0) ? (dst s% src) : dst > > +BPF_XOR 0xa 0 dst ^= src > > +BPF_MOV 0xb 0 dst = src > > +BPF_MOVSX 0xb 8/16/32 dst = (s8,s16,s32)src > > +BPF_ARSH 0xc 0 :term:`sign extending<Sign Extend>` dst >>= > (src & mask) > > +BPF_END 0xd 0 byte swap operations (see `Byte swap > instructions`_ below) > > +========= ============= ======= > > +========================================================== > > Same here. > > > Underflow and overflow are allowed during arithmetic operations, > > meaning the 64-bit or 32-bit value will wrap. If BPF program > > execution would @@ -314,13 +314,13 @@ select what byte order the > > operation converts from or to. For ``BPF_ALU64``, the 1-bit source > > operand field in the opcode is reserved and must be set to 0. > > > > -========= ========= ===== > ================================================= > > -class source value description > > -========= ========= ===== > ================================================= > > -BPF_ALU BPF_TO_LE 0x00 convert between host byte order and little > endian > > -BPF_ALU BPF_TO_BE 0x08 convert between host byte order and big > endian > > -BPF_ALU64 Reserved 0x00 do byte swap unconditionally > > -========= ========= ===== > > ================================================= > > +========= ========= ============ > ================================================= > > +class source value(1 bit) description > > +========= ========= ============ > ================================================= > > +BPF_ALU BPF_TO_LE 0x0 convert between host byte order and little > endian > > +BPF_ALU BPF_TO_BE 0x1 convert between host byte order and big > endian > > +BPF_ALU64 Reserved 0x0 do byte swap unconditionally > > +========= ========= ============ > > +================================================= > > Same here. Which bit does the 0x1 actually correspond to? It's self-evident in > the former, not the latter. Would you then say that RFC 791 (and many RFCs since) is not self-evident? If the WG chooses to diverge from the most common ways the IETF defines bit formats, that might be ok but may need a section explaining the divergent convention. My personal preference though is to stay consistent with the normal IETF convention, which part of the ISA doc already did. Dave