On Fri, 2020-09-11 at 14:58 +0200, Ilya Leoshkevich wrote: > On Thu, 2020-09-10 at 17:25 -0700, Alexei Starovoitov wrote: > > On Wed, Sep 9, 2020 at 4:37 PM Ilya Leoshkevich <iii@xxxxxxxxxxxxx> > > wrote: > > > If the original insn is a jump, then it is not subjected to > > > branch > > > adjustment, which is incorrect. As discovered by Yauheni in > > > > I think the problem is elsewhere. > > Something is wrong with zext logic. > > the branch insn should not have been marked as zext_dst. > > and in the line: > > zext_patch[0] = insn; > > this 'insn' should never be a branch. > > See insn_no_def(). > > Would it make sense to add a WARN_ON(insn_no_def(&insn)) there? > > > I believe the root cause is triggered by clear_caller_saved_regs(). > > This is our prog: > > [ 0]: BPF_JMP | BPF_CALL | BPF_K, BPF_REG_0, BPF_REG_1, 0x0, 0x1 > [ 1]: BPF_JMP | BPF_EXIT | BPF_K, BPF_REG_0, BPF_REG_0, 0x0, 0x0 > [ 2]: BPF_JMP | BPF_CALL | BPF_K, BPF_REG_0, BPF_REG_1, 0x0, 0x1 > [ 3]: BPF_JMP | BPF_EXIT | BPF_K, BPF_REG_0, BPF_REG_0, 0x0, 0x0 > ... > > and env->insn_idx is 2. clear_caller_saved_regs() calls > > check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK); > > for register 0, and then inside check_reg_arg() we come to > > reg->subreg_def = rw64 ? DEF_NOT_SUBREG : env->insn_idx + 1; > > where rw64 is false, because insn 2 is a BPF_PSEUDO_CALL. Having > non-zero subreg_def causes mark_insn_zext() to set zext_dst later on. > > Maybe mark_reg_unknown() can do something to prevent this? My knee- > jerk > reaction would be to set subreg_def to 0 there, but I'm not sure > whether this would be correct. Another possible fix (inspired by helper function call handling) is: --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -4751,6 +4751,7 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn, /* All global functions return SCALAR_VALUE */ mark_reg_unknown(env, caller->regs, BPF_REG_0); + caller->regs[BPF_REG_0].subreg_def = DEF_NOT_SUBREG; /* continue with next insn after call */ return 0; This relies on global functions always returning 64-bit values, which I believe should always be the case. If this sounds reasonable, I can send a proper patch.