On Thu, Sep 07 2023, Alexei Starovoitov wrote:

> On Thu, Sep 7, 2023 at 12:33 AM Puranjay Mohan <puranjay12@xxxxxxxxx> wrote:
>>
>> On Wed, Sep 06 2023, Alexei Starovoitov wrote:
>>
>> > On Fri, Sep 1, 2023 at 7:57 AM Puranjay Mohan <puranjay12@xxxxxxxxx> wrote:
>> >>
>> >> On Fri, Sep 01 2023, Puranjay Mohan wrote:
>> >>
>> >> > The problem here is that reg->subreg_def should be set as DEF_NOT_SUBREG for
>> >> > registers that are used as destination registers of BPF_LDX | BPF_MEMSX.
>> >> > I am seeing the same problem on ARM32 and was going to send a patch today.
>> >> >
>> >> > The problem is that is_reg64() returns false for destination registers
>> >> > of BPF_LDX | BPF_MEMSX.
>> >> > But BPF_LDX | BPF_MEMSX always loads a 64 bit value because of the
>> >> > sign extension so is_reg64() should return true.
>> >> >
>> >> > I have written a patch that I will be sending as a reply to this.
>> >> > Please let me know if that makes sense.
>> >> >
>> >>
>> >> The check_reg_arg() function will mark reg->subreg_def = DEF_NOT_SUBREG for destination
>> >> registers if is_reg64() returns true for these registers. My patch below make is_reg64()
>> >> return true for destination registers of BPF_LDX with mod = BPF_MEMSX. I feel this is the
>> >> correct way to fix this problem.
>> >>
>> >> Here is my patch:
>> >>
>> >> --- 8< ---
>> >> From cf1bf5282183cf721926ab14d968d3d4097b89b8 Mon Sep 17 00:00:00 2001
>> >> From: Puranjay Mohan <puranjay12@xxxxxxxxx>
>> >> Date: Fri, 1 Sep 2023 11:18:59 +0000
>> >> Subject: [PATCH bpf] bpf: verifier: mark destination of sign-extended load as
>> >>  64 bit
>> >>
>> >> The verifier can emit instructions to zero-extend destination registers
>> >> when the register is being used to keep 32 bit values. This behaviour is
>> >> enabled only when the JIT sets bpf_jit_needs_zext() -> true. In the case
>> >> of a sign extended load instruction, the destination register always has a
>> >> 64-bit value, therefore the verifier should not emit zero-extend
>> >> instructions for it.
>> >>
>> >> Change is_reg64() to return true if the register under consideration is a
>> >> destination register of LDX instruction with mode = BPF_MEMSX.
>> >>
>> >> Fixes: 1f9a1ea821ff ("bpf: Support new sign-extension load insns")
>> >> Signed-off-by: Puranjay Mohan <puranjay12@xxxxxxxxx>
>> >> ---
>> >>  kernel/bpf/verifier.c | 2 +-
>> >>  1 file changed, 1 insertion(+), 1 deletion(-)
>> >>
>> >> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> >> index bb78212fa5b2..93f84b868ccc 100644
>> >> --- a/kernel/bpf/verifier.c
>> >> +++ b/kernel/bpf/verifier.c
>> >> @@ -3029,7 +3029,7 @@ static bool is_reg64(struct bpf_verifier_env *env, struct bpf_insn *insn,
>> >>
>> >>         if (class == BPF_LDX) {
>> >>                 if (t != SRC_OP)
>> >> -                       return BPF_SIZE(code) == BPF_DW;
>> >> +                       return (BPF_SIZE(code) == BPF_DW || BPF_MODE(code) == BPF_MEMSX);
>> >
>> > Looks like we have a bug here for normal LDX too.
>> > This 'if' condition was inserting unnecessary zext for LDX.
>> > It was harmless for LDX and broken for LDSX.
>> > Both LDX and LDSX write all bits of 64-bit register.
>> >
>> > I think the proper fix is to remove above two lines.
>> > wdyt?
>>
>> For LDX this returns true only if it is with a BPF_DW, for others it returns false.
>> This means a zext is inserted for BPF_LDX | BPF_B/H/W.
>>
>> This is not a bug because LDX writes 64 bits of the register only with BPF_DW.
>> With BPF_B/H/W It only writes the lower 32bits and needs zext for upper 32 bits.
>
> No. The interpreter writes all 64-bit for any LDX insn.
> All JITs must do it as well.
>
>> On 32 bit architectures where a 64-bit BPF register is simulated with two 32-bit registers,
>> explicit zext is required for BPF_LDX | BPF_B/H/W.
>
> zext JIT-aid done by the verifier has nothing to do with 32-bit architecture.
> It's necessary on 64-bit as well when HW doesn't automatically zero out
> upper 32-bit like it does on arm64 and x86-64

Yes, I agree that the zext JIT-aid is required for all 32-bit architectures
and for some 64-bit architectures that can't automatically zero out the
upper 32 bits. Basically, any architecture that sets
bpf_jit_needs_zext() -> true.

>> So, we should not remove this.
>
> I still think we should.

If we remove this, then some JITs will not zero-extend the upper 32 bits
for BPF_LDX | BPF_B/H/W.

My understanding is that the verifier sets prog->aux->verifier_zext if it
emits zext instructions. If the verifier doesn't emit zext for LDX but
still sets prog->aux->verifier_zext, that would cause wrong behavior in
some JITs.

Example code from the ARM32 JIT doing BPF_LDX | BPF_MEM | BPF_B:

	case BPF_B:
		/* Load a Byte */
		emit(ARM_LDRB_I(rd[1], rm, off), ctx);
		if (!ctx->prog->aux->verifier_zext)
			emit_a32_mov_i(rd[0], 0, ctx);
		break;

Here, if ctx->prog->aux->verifier_zext is set by the verifier and zext was
not emitted for LDX, the JIT will not zero the upper 32 bits.

The RISCV32, PowerPC32, and x86-32 JITs have similar code paths. Only the
MIPS32 JIT zero-extends for LDX without checking prog->aux->verifier_zext.

So, if we want to stop emitting zext for LDX, we would need to modify
all these JITs to always zext for LDX.

Let me know if my understanding has some gaps. Also, if we decide to
remove it, I am happy to send patches for it and fix the JITs that need
modifications.

Thanks,
Puranjay
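
For readers following the thread, the LDX destination check under discussion can be modelled outside the kernel. The sketch below is not kernel code: the two helper names are hypothetical, and it covers only the `class == BPF_LDX`, `t != SRC_OP` branch of is_reg64() shown in the patch. The opcode constants are copied from the UAPI headers (bpf_common.h and bpf.h).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Opcode field encodings, as defined in the kernel UAPI headers. */
#define BPF_CLASS(code) ((code) & 0x07)
#define BPF_LDX         0x01
#define BPF_SIZE(code)  ((code) & 0x18)
#define BPF_W           0x00
#define BPF_H           0x08
#define BPF_B           0x10
#define BPF_DW          0x18
#define BPF_MODE(code)  ((code) & 0xe0)
#define BPF_MEM         0x60
#define BPF_MEMSX       0x80

/*
 * Hypothetical helper: the check before the patch. Only a 64-bit wide
 * load marks the destination register as a full 64-bit definition.
 */
bool ldx_dst_is_64_before(uint8_t code)
{
	return BPF_SIZE(code) == BPF_DW;
}

/*
 * Hypothetical helper: the check from the patch. A sign-extending load
 * of any size also counts as a full 64-bit definition, so the verifier
 * would not insert a zext after it.
 */
bool ldx_dst_is_64_patched(uint8_t code)
{
	return BPF_SIZE(code) == BPF_DW || BPF_MODE(code) == BPF_MEMSX;
}
```

With these definitions, `BPF_LDX | BPF_MEMSX | BPF_B` is treated as a 32-bit definition by the first helper and as a 64-bit definition by the second, while `BPF_LDX | BPF_MEM | BPF_B` is a 32-bit definition under both. The latter is the case the ARM32 snippet above relies on.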