RE: 64-bit immediate instructions clarification

dthaler1968@xxxxxxxxxxxxxx · Fri, 26 Jan 2024 14:27:10 -0800

Yonghong Song <yonghong.song@xxxxxxxxx> wrote: 
> On 1/25/24 5:12 PM, dthaler1968@xxxxxxxxxxxxxx wrote:
> > The spec defines:
> >> As discussed below in `64-bit immediate instructions`_, a 64-bit
> >> immediate instruction uses a 64-bit immediate value that is constructed as
> follows.
> >> The 64 bits following the basic instruction contain a pseudo
> >> instruction using the same format but with opcode, dst_reg, src_reg,
> >> and offset all set to zero, and imm containing the high 32 bits of the
> immediate value.
> > [...]
> >> imm64 = (next_imm << 32) | imm
> > The 64-bit immediate instructions section then says:
> >> Instructions with the ``BPF_IMM`` 'mode' modifier use the wide
> >> instruction encoding defined in `Instruction encoding`_, and use the
> >> 'src' field of the basic instruction to hold an opcode subtype.
> > Some instructions then nicely state how to use the full 64 bit
> > immediate value, such as
> >> BPF_IMM | BPF_DW | BPF_LD  0x18    0x0  dst = imm64
> integer      integer
> >> BPF_IMM | BPF_DW | BPF_LD  0x18    0x2  dst = map_val(map_by_fd(imm))
> + next_imm   map fd       data pointer
> >> BPF_IMM | BPF_DW | BPF_LD  0x18    0x6  dst = map_val(map_by_idx(imm))
> + next_imm  map index    data pointer
> > Others don't:
> >> BPF_IMM | BPF_DW | BPF_LD  0x18    0x1  dst = map_by_fd(imm)
> map fd       map
> >> BPF_IMM | BPF_DW | BPF_LD  0x18    0x3  dst = var_addr(imm)
> variable id  data pointer
> >> BPF_IMM | BPF_DW | BPF_LD  0x18    0x4  dst = code_addr(imm)
> integer      code pointer
> >> BPF_IMM | BPF_DW | BPF_LD  0x18    0x5  dst = map_by_idx(imm)
> map index    map
> > How is next_imm used in those four?  Must it be 0?  Or can it be anything and
> it's ignored?
> > Or is it used for something?
> 
> The other four must have next_imm to be 0. No use of next_imm in thee four
> insns kindly implies this.
> See uapi bpf.h for details (search BPF_PSEUDO_MAP_FD).

Thanks for confirming.  The "Instruction encoding" section has misleading text
in my opinion.

It nicely says:
> Note that most instructions do not use all of the fields. Unused fields shall be cleared to zero.

But then goes on to say:
> As discussed below in 64-bit immediate instructions (Section 4.4), a 64-bit immediate instruction
> uses a 64-bit immediate value that is constructed as follows.
[...]
> imm64 = (next_imm << 32) | imm

Under a normal English reading, that could imply that all 64-bit immediate instructions use imm64,
which is not the case.  The whole imm64 discussion there only applies today to src=0 (though I
suppose it could be used by future 64-bit immediate instructions).   Minimally I think
"a 64-bit immediate instruction uses" should be "some 64-bit immediate instructions use"
but at present there's only one.

It would actually be simpler to remove the imm64 text and just have the
definition of src 0x0 change from: "dst = imm64" to "dst = (next_imm << 32) | imm".

What do you think?

Dave