> On 2/27/23 5:12 PM, Jose E. Marchesi wrote:
>> [Changes from V3:
>> - Back to src_reg and dst_reg, since they denote register numbers
>>   as opposed to the values stored in these registers.]
>> [Changes from V2:
>> - Use src and dst consistently in the document.
>> - Use a more graphical depiction of the 128-bit instruction.
>> - Remove `Where:' fragment.
>> - Clarify that unused bits are reserved and shall be zeroed.]
>> [Changes from V1:
>> - Use rst literal blocks for figures.
>> - Avoid using | in the basic instruction/pseudo instruction figure.
>> - Rebased to today's bpf-next master branch.]
>>
>> This patch modifies instruction-set.rst so it documents the encoding
>> of BPF instructions in terms of how the bytes are stored (be it in an
>> ELF file or as bytes in a memory buffer to be loaded into the kernel
>> or some other BPF consumer) as opposed to how the instruction looks
>> like once loaded.
>>
>> This is hopefully easier to understand by implementors looking to
>> generate and/or consume bytes conforming BPF instructions.
>>
>> The patch also clarifies that the unused bytes in a pseudo-instruction
>> shall be cleared with zeros.
>>
>> Signed-off-by: Jose E. Marchesi <jose.marchesi@xxxxxxxxxx>
>> ---
>>  Documentation/bpf/instruction-set.rst | 46 ++++++++++++++-------------
>>  1 file changed, 24 insertions(+), 22 deletions(-)
>>
>> diff --git a/Documentation/bpf/instruction-set.rst b/Documentation/bpf/instruction-set.rst
>> index 01802ed9b29b..f67a6677ae09 100644
>> --- a/Documentation/bpf/instruction-set.rst
>> +++ b/Documentation/bpf/instruction-set.rst
>> @@ -38,15 +38,11 @@ eBPF has two instruction encodings:
>>  * the wide instruction encoding, which appends a second 64-bit immediate (i.e.,
>>    constant) value after the basic instruction for a total of 128 bits.
>>
>> -The basic instruction encoding looks as follows for a little-endian processor,
>> -where MSB and LSB mean the most significant bits and least significant bits,
>> -respectively:
>> +The fields conforming an encoded basic instruction are stored in the
>> +following order::
>>
>> -=============  =======  =======  =======  ============
>> -32 bits (MSB)  16 bits  4 bits   4 bits   8 bits (LSB)
>> -=============  =======  =======  =======  ============
>> -imm            offset   src_reg  dst_reg  opcode
>> -=============  =======  =======  =======  ============
>> +  opcode:8 src_reg:4 dst_reg:4 offset:16 imm:32 // In little-endian BPF.
>> +  opcode:8 dst_reg:4 src_reg:4 offset:16 imm:32 // In big-endian BPF.
>>
>>  **imm**
>>    signed integer immediate value
>> @@ -64,16 +60,17 @@ imm offset src_reg dst_reg opcode
>>  **opcode**
>>    operation to perform
>>
>> -and as follows for a big-endian processor:
>> +Note that the contents of multi-byte fields ('imm' and 'offset') are
>> +stored using big-endian byte ordering in big-endian BPF and
>> +little-endian byte ordering in little-endian BPF.
>>
>> -=============  =======  =======  =======  ============
>> -32 bits (MSB)  16 bits  4 bits   4 bits   8 bits (LSB)
>> -=============  =======  =======  =======  ============
>> -imm            offset   dst_reg  src_reg  opcode
>> -=============  =======  =======  =======  ============
>> +For example::
>>
>> -Multi-byte fields ('imm' and 'offset') are similarly stored in
>> -the byte order of the processor.
>> +  opcode                  offset  imm          assembly
>> +          src_reg dst_reg
>> +  07      0       1       00 00   44 33 22 11  r1 += 0x11223344 // little
>> +          dst_reg src_reg
>> +  07      1       0       00 00   11 22 33 44  r1 += 0x11223344 // big
>>
>>  Note that most instructions do not use all of the fields.
>>  Unused fields shall be cleared to zero.
>> @@ -84,18 +81,23 @@ The 64 bits following the basic instruction contain a pseudo instruction
>>  using the same format but with opcode, dst_reg, src_reg, and offset all set to zero,
>>  and imm containing the high 32 bits of the immediate value.
>>
>> -=================  ==================
>> -64 bits (MSB)      64 bits (LSB)
>> -=================  ==================
>> -basic instruction  pseudo instruction
>> -=================  ==================
>> +This is depicted in the following figure::
>> +
>> +        basic_instruction
>> +  .-----------------------------.
>> +  |                             |
>> +  code:8 regs:16 offset:16 imm:32 unused:32 imm:32
>
> regs:16 -> regs:8

Thanks. Fixed in a V5.

>> +                                  |              |
>> +                                  '--------------'
>> +                                 pseudo instruction
>>
>>  Thus the 64-bit immediate value is constructed as follows:
>>
>>    imm64 = (next_imm << 32) | imm
>>
>>  where 'next_imm' refers to the imm value of the pseudo instruction
>> -following the basic instruction.
>> +following the basic instruction. The unused bytes in the pseudo
>> +instruction are reserved and shall be cleared to zero.
>>
>>  Instruction classes
>>  -------------------
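
For anyone following the thread, here is a minimal C sketch of the byte layout described in the quoted patch. It is not part of the patch. The names `struct insn_fields`, `encode_insn` and `make_imm64` are hypothetical and exist only for illustration. The sketch assumes the register byte is 8 bits wide (per the regs:8 correction above) and that the first-listed 4-bit field sits in the high nibble of that byte, which is how I read the "07 0 1" / "07 1 0" example.

```c
#include <stdint.h>

/* Hypothetical container for the fields of one basic instruction. */
struct insn_fields {
    uint8_t opcode;
    uint8_t dst_reg;  /* register number, 0..15 */
    uint8_t src_reg;  /* register number, 0..15 */
    int16_t offset;
    int32_t imm;
};

/*
 * Store the 8 bytes of a basic instruction in the order given in the
 * patch: opcode, register byte, offset, imm.
 * Assumption: the first-listed 4-bit field is the high nibble.
 */
static void encode_insn(const struct insn_fields *f, int big_endian,
                        uint8_t out[8])
{
    uint16_t off = (uint16_t)f->offset;
    uint32_t imm = (uint32_t)f->imm;

    out[0] = f->opcode;
    if (big_endian) {
        /* dst_reg:4 src_reg:4, multi-byte fields most significant first */
        out[1] = (uint8_t)(((f->dst_reg & 0xf) << 4) | (f->src_reg & 0xf));
        out[2] = (uint8_t)(off >> 8);
        out[3] = (uint8_t)off;
        out[4] = (uint8_t)(imm >> 24);
        out[5] = (uint8_t)(imm >> 16);
        out[6] = (uint8_t)(imm >> 8);
        out[7] = (uint8_t)imm;
    } else {
        /* src_reg:4 dst_reg:4, multi-byte fields least significant first */
        out[1] = (uint8_t)(((f->src_reg & 0xf) << 4) | (f->dst_reg & 0xf));
        out[2] = (uint8_t)off;
        out[3] = (uint8_t)(off >> 8);
        out[4] = (uint8_t)imm;
        out[5] = (uint8_t)(imm >> 8);
        out[6] = (uint8_t)(imm >> 16);
        out[7] = (uint8_t)(imm >> 24);
    }
}

/*
 * imm64 = (next_imm << 32) | imm, computed on unsigned values so that a
 * negative low half does not sign-extend into the high half.
 */
static uint64_t make_imm64(int32_t imm, int32_t next_imm)
{
    return ((uint64_t)(uint32_t)next_imm << 32) | (uint32_t)imm;
}
```

With opcode 0x07, dst_reg 1, src_reg 0, offset 0 and imm 0x11223344, `encode_insn` produces the two byte sequences shown in the "For example::" block of the patch, one per endianness.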