On Tue, Apr 2, 2024 at 8:48 AM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
>
> On 4/2/24 1:38 AM, Alexei Starovoitov wrote:
> > From: Alexei Starovoitov <ast@xxxxxxxxxx>
> >
> > Turned out that bpf prog callback addresses, bpf prog addresses
> > used in bpf_trampoline, and in other cases the 64-bit address
> > can be represented as sign extended 32-bit value.
> > According to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82339
> > "Skylake has 0.64c throughput for mov r64, imm64, vs. 0.25 for mov r32, imm32."
> > So use shorter encoding and faster instruction when possible.
> >
> > Special care is needed in jit_subprogs(), since bpf_pseudo_func()
> > instruction cannot change its size during the last step of JIT.
> >
> > Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx>
> > ---
> >   arch/x86/net/bpf_jit_comp.c |  5 ++++-
> >   kernel/bpf/verifier.c       | 13 ++++++++++---
> >   2 files changed, 14 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > index 3b639d6f2f54..47abddac6dc3 100644
> > --- a/arch/x86/net/bpf_jit_comp.c
> > +++ b/arch/x86/net/bpf_jit_comp.c
> > @@ -816,9 +816,10 @@ static void emit_mov_imm32(u8 **pprog, bool sign_propagate,
> >   static void emit_mov_imm64(u8 **pprog, u32 dst_reg,
> >                              const u32 imm32_hi, const u32 imm32_lo)
> >   {
> > +     u64 imm64 = ((u64)imm32_hi << 32) | (u32)imm32_lo;
> >       u8 *prog = *pprog;
> >
> > -     if (is_uimm32(((u64)imm32_hi << 32) | (u32)imm32_lo)) {
> > +     if (is_uimm32(imm64)) {
> >               /*
> >                * For emitting plain u32, where sign bit must not be
> >                * propagated LLVM tends to load imm64 over mov32
> >                * directly, so save couple of bytes by just doing
> >                * 'mov %eax, imm32' instead.
> >                */
> >               emit_mov_imm32(&prog, false, dst_reg, imm32_lo);
> > +     } else if (is_simm32(imm64)) {
> > +             emit_mov_imm32(&prog, true, dst_reg, imm32_lo);
> >       } else {
> >               /* movabsq rax, imm64 */
> >               EMIT2(add_1mod(0x48, dst_reg), add_1reg(0xB8, dst_reg));
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index edb650667f44..d4a338e7b5e7 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -19145,12 +19145,19 @@ static int jit_subprogs(struct bpf_verifier_env *env)
> >                       env->insn_aux_data[i].call_imm = insn->imm;
> >                       /* point imm to __bpf_call_base+1 from JITs point of view */
> >                       insn->imm = 1;
> > -                     if (bpf_pseudo_func(insn))
> > +                     if (bpf_pseudo_func(insn)) {
> > +#if defined(MODULES_VADDR)
> > +                             u64 addr = MODULES_VADDR;
> > +#else
> > +                             u64 addr = VMALLOC_START;
> > +#endif
>
> Is this beneficial for all archs? It seems this patch is mainly targeting x86.
> Why not having a weak function like u64 bpf_jit_alloc_exec_start() which returns
> the MODULES_VADDR for x86, but leaves the rest as-is?
>
> For example, arm64 has MODULES_VADDR defined, but the allocator uses vmalloc
> range instead, see bpf_jit_alloc_exec() there, so this is a different pool and
> it's also not clear if this is better or worse wrt its imm encoding.

This part makes no difference for all JITs except x86.
Back when commit 3990ed4c4266 ("bpf: Stop caching subprog index in the
bpf_pseudo_func insn") added the comment below:
"jit (e.g. x86_64) may emit fewer instructions"
pseudo_func-s were introduced for x86, and only the x86 JIT had this behavior.
Since then other JITs added support for pseudo_func-s,
but none of them rely on this part of the verifier.
So the comment still applies to x86 only (afaics).
s390, riscv, arm64 went with:
"if (bpf_pseudo_func)" process ld_imm64 differently regardless of
the value of insn[0].imm, insn[1].imm.
I think it's a bit wrong.
I considered removing this if (bpf_pseudo_func(insn)) from verifier.c
and doing a similar hack in the x86 JIT, but decided against that.
The previous insn[1].imm = 1 was a hack targeted at x86.
It served its purpose for 3 years.
A hack, but imo cleaner than if (bpf_pseudo_func(insn)) in JITs.

Since I'm making emit_mov_imm64() smarter, there is a need to make this
part of verifier.c a bit more accurate in terms of the value it represents.
MODULES_VADDR or VMALLOC_START doesn't make a difference.
It's a kernel text address. It could be (long)&_text, fwiw.

I believe all JITs can potentially generalize the
if (bpf_pseudo_func(insn)) check into if (kernel_addr(imm64)),
but that's a follow up for somebody.

A weak helper bpf_jit_alloc_exec_start() is certainly overkill.
A pseudo_func callback doesn't have to be a jit-ed bpf prog.
It's the address of the function.
If there is ever an arch where kernel and jit-ed code need different
insns to represent an address, then we will tackle such an issue at
that time.

Notice that we have similar #if defined(MODULES_VADDR) logic in
bpf_jit_alloc_exec_limit() that was added 6 years ago and it's still
fine. No need to over design this one either.

> >                               /* jit (e.g. x86_64) may emit fewer instructions
> >                                * if it learns a u32 imm is the same as a u64 imm.
> > -                              * Force a non zero here.
> > +                              * Set close enough to possible prog address.
> >                                */
> > -                             insn[1].imm = 1;
> > +                             insn[0].imm = (u32)addr;
> > +                             insn[1].imm = addr >> 32;
> > +                     }