Re: [PATCH 11/15] ebpf-docs: Improve English readability

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> · Tue, 4 Oct 2022 08:38:04 -0700

On Tue, Oct 4, 2022 at 7:32 AM Dave Thaler <dthaler@xxxxxxxxxxxxx> wrote:
>
> Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
> > > +The eBPF instruction set consists of eleven 64 bit registers, a
> > > +program counter, and 512 bytes of stack space.
> >
> > I would not add 512 to a doc.
> > That's what we have now, but we might relax that in the future.
>
> I think it's important to at least give a minimum.  Will change to
> "at least 512".

I'm not sure 512 stands as a minimum either.
When we have the gatekeeper prog it will be able to enforce
any stack size.

> [...]
> > > +Registers R0 - R5 are scratch registers, meaning the BPF program
> > > +needs to either spill them to the BPF stack or move them to callee
> > > +saved registers if these arguments are to be reused across multiple
> > > +function calls. Spilling means that the value in the register is
> > > +moved to the BPF stack. The reverse operation of moving the variable
> > from the BPF stack to the register is called filling.
> > > +The reason for spilling/filling is due to the limited number of registers.
> >
> > More canonical way would be to say that r0-r5 are caller saved.
>
> Will change "scratch" to "caller-saved"
>
> [...]
> > > -``BPF_ADD | BPF_X | BPF_ALU64`` means::
> > > +``BPF_ADD | BPF_X | BPF_ALU64`` (0x0f) means::
> >
> > I don't think adding hex values help here.
>
> I found it very helpful in verifying that the Appendix table
> was correct, and providing a correlation to the text here
> that shows the construction of the value.  So I'd like to keep them.

I think that means that the appendix table shouldn't be there either.
I'd like to avoid both.

> [...]
> > > -The 1-bit source operand field in the opcode is used to to select
> > > what byte -order the operation convert from or to:
> > > +Byte swap instructions use non-default semantics of the 1-bit
> > > +'source' field in
> >
> > I would drop 'non-default'. Many fields have different meanings depending
> > on opcode.
> > BPF_SRC() macro just reads that bit.
>
> Will reword to say:
> Byte swap instructions use the 1-bit 'source' field in the 'opcode' field
> as follows.  Instead of indicating the source operator, it is instead
> used to select what byte order the operation converts from or to:

makes sense.

> [...]
> > > +* mnenomic indicates a short form that might be displayed by some
> > > +tools such as disassemblers
> > > +* 'htoleNN()' indicates converting a NN-bit value from host byte
> > > +order to little-endian byte order
> > > +* 'htobeNN()' indicates converting a NN-bit value from host byte
> > > +order to big-endian byte order
> >
> > I think we need to add normal bswap insn.
> > These to_le and to_be are very awkward to use.
> > As soon as we have new insn the compiler will be using it.
> > There is no equivalent of to_be and to_le in C. It wasn't good ISA design.
>
> I will interpret this as a request for someone to do code work, rather
> than any request for immediately changes to the doc :)
>
> [...]
> > >  Regular load and store operations
> > >  ---------------------------------
> > >
> > >  The ``BPF_MEM`` mode modifier is used to encode regular load and
> > > store  instructions that transfer data between a register and memory.
> > >
> > > -``BPF_MEM | <size> | BPF_STX`` means::
> > > -
> > > -  *(size *) (dst + offset) = src_reg
> > > -
> > > -``BPF_MEM | <size> | BPF_ST`` means::
> > > -
> > > -  *(size *) (dst + offset) = imm32
> > > -
> > > -``BPF_MEM | <size> | BPF_LDX`` means::
> > > -
> > > -  dst = *(size *) (src + offset)
> > > -
> > > -Where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW``.
> > > +=============================  =========
> > ====================================
> > > +opcode construction            opcode     pseudocode
> > > +=============================  =========
> > ====================================
> > > +BPF_MEM | BPF_B | BPF_LDX      0x71       dst = \*(uint8_t \*) (src + offset)
> > > +BPF_MEM | BPF_H | BPF_LDX      0x69       dst = \*(uint16_t \*) (src +
> > offset)
> > > +BPF_MEM | BPF_W | BPF_LDX      0x61       dst = \*(uint32_t \*) (src +
> > offset)
> > > +BPF_MEM | BPF_DW | BPF_LDX     0x79       dst = \*(uint64_t \*) (src +
> > offset)
> > > +BPF_MEM | BPF_B | BPF_ST       0x72       \*(uint8_t \*) (dst + offset) = imm
> > > +BPF_MEM | BPF_H | BPF_ST       0x6a       \*(uint16_t \*) (dst + offset) =
> > imm
> > > +BPF_MEM | BPF_W | BPF_ST       0x62       \*(uint32_t \*) (dst + offset) =
> > imm
> > > +BPF_MEM | BPF_DW | BPF_ST      0x7a       \*(uint64_t \*) (dst + offset) =
> > imm
> > > +BPF_MEM | BPF_B | BPF_STX      0x73       \*(uint8_t \*) (dst + offset) = src
> > > +BPF_MEM | BPF_H | BPF_STX      0x6b       \*(uint16_t \*) (dst + offset) =
> > src
> > > +BPF_MEM | BPF_W | BPF_STX      0x63       \*(uint32_t \*) (dst + offset) =
> > src
> > > +BPF_MEM | BPF_DW | BPF_STX     0x7b       \*(uint64_t \*) (dst + offset) =
> > src
> > > +=============================  =========
> > > +====================================
> >
> > I think the table is more verbose and less readable than the original text.
>
> Will change back to original text.

Please see git. I've removed that table. Please don't add it back.
I see no value in such tables other than more things to get wrong.