Hello!

I have been working on this JIT during the last couple of weeks,
following my initial questions and thoughts around this in April
("Completing eBPF JIT support for MIPS32"). Perhaps I should have been
clearer that I intended to add the missing functionality, but having
received no response, seeing no activity on the subject since 2019, and
with MIPS the company switching to RISC-V, I frankly did not think
anyone else was interested. I was not aware that Tony was working on
the same thing. Anyway, here it goes.

This is an implementation of an eBPF JIT for MIPS I-V and MIPS32. The
implementation supports all 32-bit and 64-bit ALU and JMP operations,
including the recently-added atomics. 64-bit div/mod and 64-bit atomics
are implemented using function calls to math64 and atomic64 functions,
respectively. All 32-bit operations are implemented natively by the JIT.

The implementation is intended to provide good ALU32 performance, and
completeness for ALU64 instructions so it never has to fall back to the
interpreter. Care has also been taken to make the code as simple and
clean as possible. Complex and input-sensitive logic that is hard to
test has intentionally been avoided, especially for ALU64 operations.
The JIT relies on the verifier to do more complex analysis such as
explicit zero-extension.

Relation to the MIPS64 JIT
==========================
The decision to not extend the existing MIPS64 JIT with 32-bit support
was made for the following reasons.

First, the 64-bit JIT is already very complex. It contains its own
static analyzer for doing zero- and sign-extensions on 32-bit values.
That is complexity not needed for the 32-bit JIT.

Second, the 32-bit JIT has more in common with other 32-bit JITs, say,
ARM, than MIPS64. The register mapping will be different. ALU32
operations are different. ALU64 operations are different. JMP/JMP32
operations are different. What is native word size and easy on one is
emulated and difficult on the other, and vice-versa.

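To illustrate the last point: a 64-bit add is a single instruction on
MIPS64, but on a 32-bit CPU it has to be built from 32-bit operations
on a register pair. The C sketch below models that with the usual
addu/sltu carry sequence. The struct and function names are made up
for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one 64-bit eBPF register held in two 32-bit
 * CPU registers. Not the data structure used by the JIT itself. */
struct reg_pair {
	uint32_t lo;
	uint32_t hi;
};

/* 64-bit add using 32-bit operations only. The comments show the
 * MIPS32 instruction each step would typically correspond to. */
static struct reg_pair add64_pair(struct reg_pair dst, struct reg_pair src)
{
	uint32_t carry;

	dst.lo += src.lo;          /* addu lo, lo, src_lo */
	carry = dst.lo < src.lo;   /* sltu t, lo, src_lo  */
	dst.hi += src.hi;          /* addu hi, hi, src_hi */
	dst.hi += carry;           /* addu hi, hi, t      */
	return dst;
}

/* Convenience wrapper: split two 64-bit values into pairs, add them,
 * and join the result again. */
static uint64_t add64_via_pairs(uint64_t a, uint64_t b)
{
	struct reg_pair x = { (uint32_t)a, (uint32_t)(a >> 32) };
	struct reg_pair y = { (uint32_t)b, (uint32_t)(b >> 32) };
	struct reg_pair r = add64_pair(x, y);

	return ((uint64_t)r.hi << 32) | r.lo;
}

/* Small self-check comparing against the compiler's native 64-bit add. */
static void add64_selfcheck(void)
{
	uint64_t a = 0x00000001ffffffffULL, b = 0x0000000000000001ULL;

	assert(add64_via_pairs(a, b) == a + b);
}
```

The same four-step pattern costs one instruction on a 64-bit CPU,
while the reverse holds for 32-bit operations that need explicit
zero- or sign-extension on MIPS64.
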
There may of course be utility code that can be shared between the two
JITs, but as a whole the 32-bit and 64-bit JITs are likely easier to
test and maintain as separate, dedicated implementations rather than as
one big JIT that needs to handle a super-set of the combined complexity.

Register mapping
================
All 64-bit eBPF registers are mapped to native 32-bit MIPS register
pairs, and the JIT does not use any stack scratch space for register
swapping. This means that all eBPF register data is kept in CPU
registers all the time, which is good for performance of course. It
also simplifies the register management a lot and reduces the hunger
for temporary registers since we do not have to move data around.

Native register pairs are ordered according to CPU endianness,
following the O32 calling convention for passing 64-bit arguments and
return values. The eBPF return value, arguments and callee-saved
registers are mapped to their native MIPS equivalents.

Since the 32 highest bits in the eBPF FP (frame pointer) register are
always zero, only one general-purpose register is actually needed for
the mapping. The MIPS fp register is used for this purpose. The high
bits are mapped to MIPS register r0. This saves us one CPU register,
which is much needed for temporaries, while still allowing us to treat
the R10 (FP) register just like any other eBPF register in the JIT.

The MIPS gp (global pointer) and at (assembler temporary) registers are
used as internal temporary registers for constant blinding. CPU
registers t6-t9 are used internally by the JIT when constructing more
complex 64-bit operations. This is precisely what is needed - two
registers to store an immediate operand value, and two more as scratch
registers to perform the operation.

The register mapping is shown below.

    R0 - $v1, $v0   return value
    R1 - $a1, $a0   argument 1, passed in registers
    R2 - $a3, $a2   argument 2, passed in registers
    R3 - $t1, $t0   argument 3, passed on stack
    R4 - $t3, $t2   argument 4, passed on stack
    R5 - $t5, $t4   argument 5, passed on stack
    R6 - $s1, $s0   callee-saved
    R7 - $s3, $s2   callee-saved
    R8 - $s5, $s4   callee-saved
    R9 - $s7, $s6   callee-saved
    FP - $r0, $fp   32-bit frame pointer
    AX - $gp, $at   constant-blinding
         $t6 - $t9  unallocated, JIT temporaries

Jump offsets
============
The JIT tries to map all conditional JMP operations to MIPS conditional
PC-relative branches. The MIPS branch offset field is 18 bits, in bytes,
which is equivalent to the eBPF 16-bit instruction offset. However,
since the JIT may emit more than one CPU instruction per eBPF
instruction, the value may overflow the field width. If that happens,
the JIT converts the long conditional jump to a short PC-relative branch
with the condition inverted, jumping over a long unconditional absolute
jmp (ja).

This conversion will change the instruction offset mapping used for
jumps, and may in turn result in more branch offset overflows. The JIT
therefore dry-runs the translation until no more branches are converted
and the offsets do not change anymore. There is an upper bound on this
of course, and if the JIT hits that limit, the last two iterations are
run with all branches being converted.

Testing
=======
The implementation has been verified with the BPF test suite on QEMU,
emulating MIPS32r2 in big and little endian configurations. It has also
been verified on a MIPS 24Kc CPU (MT7628 SoC, little endian). The
MIPS I-V variants that exist for some operations have been verified
"manually" by forcing fallback to pre-r1 instructions only.

As of this writing, the BPF test suite JITs all tests successfully.

    test_bpf: Summary: 378 PASSED, 0 FAILED, [366/366 JIT'ed]

During the development of this JIT, several new tests were added to the
test suite in order to test corner cases inherent to 32-bit JITs, tail
calls and also to actually trigger the branch conversion handling. That
is another patch set, though.

Cheers,
Johan

Johan Almbladh (2):
  mips: bpf: add eBPF JIT for 32-bit MIPS
  mips: bpf: enable 32-bit eBPF JIT

 arch/mips/Kconfig           |    5 +-
 arch/mips/net/Makefile      |    7 +-
 arch/mips/net/ebpf_jit_32.c | 2207 +++++++++++++++++++++++++++++++++++
 3 files changed, 2216 insertions(+), 3 deletions(-)
 create mode 100644 arch/mips/net/ebpf_jit_32.c

-- 
2.25.1