On Tue, Dec 19, 2023 at 10:10 AM <dthaler1968@xxxxxxxxxxxxxx> wrote: > > > -----Original Message----- > > From: Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> > > Sent: Monday, December 18, 2023 5:15 PM > > To: Christoph Hellwig <hch@xxxxxxxxxxxxx> > > Cc: David Vernet <void@xxxxxxxxxxxxx>; Dave Thaler > > <dthaler1968@xxxxxxxxxxxxxx>; bpf@xxxxxxxx; bpf <bpf@xxxxxxxxxxxxxxx>; > > Jakub Kicinski <kuba@xxxxxxxxxx> > > Subject: Re: [Bpf] BPF ISA conformance groups > > > > On Thu, Dec 14, 2023 at 9:29 PM Christoph Hellwig <hch@xxxxxxxxxxxxx> > > wrote: > > > > > > We need the concept in the spec just to allow future extensability. > > > > Completely agree that the concept of the groups is necessary. > > > > I'm arguing that what was proposed: > > 1. "basic": all instructions not covered by another group below. > > 2. "atomic": all Atomic operations. > > 3. "divide": all division and modulo operations. > > 4. "legacy": all legacy packet access instructions (deprecated). > > 5. "map": 64-bit immediate instructions that deal with map fds or map > > indices. > > 6. "code": 64-bit immediate instruction that has a "code pointer" type. > > 7. "func": program-local functions. > > > > logically makes sense, but might not work for HW (based on the history of nfp > > offload). > > imo "basic" and "legacy" won't work either. > > So it's a lesser evil. > > > > Anyway, let's look at: > > > > | BPF_CALL | 0x8 | 0x0 | call helper | see Helper | > > | | | | function by address | functions | > > | | | | | (Section 3.3.1) | > > +----------+-------+-----+---------------------+-------------------+ > > | BPF_CALL | 0x8 | 0x1 | call PC += imm | see Program-local | > > | | | | | functions | > > | | | | | (Section 3.3.2) | > > +----------+-------+-----+---------------------+-------------------+ > > | BPF_CALL | 0x8 | 0x2 | call helper | see Helper | > > | | | | function by BTF ID | functions | > > | | | | | (Section 3.3. > > > > Having separate category 7 for single insn BPF_CALL 0x8 0x1 while keeping 0x8 > > 0x0 and 0x8 0x2 in "basic" seems just as logical as having atomic_add insn in > > "basic" instead of "atomic". > > If a platform exposes no helper functions, then 0x8 0x0 and 0x8 0x2 have no > meaning and in my view don't need a separate conformance group since a > program using them would fail the verifier anyway. Right, but bringing the verifier into the "compliance picture" makes the ISA standard incomplete. Same can be said about nfp compliance. It's compliant with an ISA, but the verifier will reject things it doesn't support. The instruction groups need to be binary from compliance pov without external input. I think if we move one call 8 1 into a separate group we better move all call flavors into the same group or have 3 groups for 3 different flavors of calls. > 0x8 0x1 on the other hand wouldn't be invalid just due to the imm value, > and so tools (compiler, verifier, whatever) need some other way to know whether > it's supported, hence the conformance group. > > > Then we have several kinds of ld_imm64. Sounds like the idea is to split 0x18 > > 0x4 into "code" and the rest into "map" group? > > Is it logical or not? > > I don't know of another easy way for a tool like a compiler (LLVM, gcc, rust compiler, > etc.) to know whether map instructions are legal or not. > > That said, I think map_val() is problematic for a cross-platform compiler... > https://elixir.bootlin.com/linux/latest/source/Documentation/bpf/linux-notes.rst says > "Linux only supports the 'map_val(map)' operation on array maps with a single element." > Now if one platform supports it on one type of map and another platform doesn't, then > the compiler has to magically know whether to allow this optimization (compared to > requiring using a helper function to access the map value) or not. Compiler has no idea. All ld_imm64 and call insns look the same. The compiler emits them the same way. The src_reg encoding is what libbpf does based on compiler relocations. Then the verifier checks them differently and later JIT sees _all_ ld_imm64 as one type of instruction. Same with call insn. To x86/arm64/riscv JITs there is only one BPF CALL insn. > > > Maybe we should do risc-v like group instead? > > Just these 4: > > - Base Integer Instruction Set, 32-bit > > - Base Integer Instruction Set, 64-bit > > If there's platforms that would support one of the above and not the other > (are there?) then I agree splitting them would make sense. nfp is an example. It kinda sorta supports 64-bit, but very much prefers 32. All 32-bit architectures is another example. They JIT nicely all 32-bit ops and struggle with 64.