On Wed, Oct 11, 2023 at 4:42 AM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote: > > On Tue, Oct 10, 2023 at 1:33 AM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote: > > > > On 10/10/23 9:02 AM, John Fastabend wrote: > > > Hao Sun wrote: > > >> Currently, we don't check if the branch-taken of a jump is reserved code of > > >> ld_imm64. Instead, such a issue is captured in check_ld_imm(). The verifier > > >> gives the following log in such case: > > >> > > >> func#0 @0 > > >> 0: R1=ctx(off=0,imm=0) R10=fp0 > > >> 0: (18) r4 = 0xffff888103436000 ; R4_w=map_ptr(off=0,ks=4,vs=128,imm=0) > > >> 2: (18) r1 = 0x1d ; R1_w=29 > > >> 4: (55) if r4 != 0x0 goto pc+4 ; R4_w=map_ptr(off=0,ks=4,vs=128,imm=0) > > >> 5: (1c) w1 -= w1 ; R1_w=0 > > >> 6: (18) r5 = 0x32 ; R5_w=50 > > >> 8: (56) if w5 != 0xfffffff4 goto pc-2 > > >> mark_precise: frame0: last_idx 8 first_idx 0 subseq_idx -1 > > >> mark_precise: frame0: regs=r5 stack= before 6: (18) r5 = 0x32 > > >> 7: R5_w=50 > > >> 7: BUG_ld_00 > > >> invalid BPF_LD_IMM insn > > >> > > >> Here the verifier rejects the program because it thinks insn at 7 is an > > >> invalid BPF_LD_IMM, but such a error log is not accurate since the issue > > >> is jumping to reserved code not because the program contains invalid insn. > > >> Therefore, make the verifier check the jump target during check_cfg(). For > > >> the same program, the verifier reports the following log: > > > > > > I think we at least would want a test case for this. Also how did you create > > > this case? Is it just something you did manually and noticed a strange error? > > > > Curious as well. > > > > We do have test cases which try to jump into the middle of a double insn as can > > be seen that this patch breaks BPF CI with regards to log mismatch below (which > > still needs to be adapted, too). Either way, it probably doesn't hurt to also add > > the above snippet as a test. > > > > Hao, as I understand, the patch here is an usability improvement (not a fix per se) > > where we reject such cases earlier during cfg check rather than at a later point > > where we validate ld_imm instruction. Or are there cases you found which were not > > yet captured via current check_ld_imm()? > > > > test_verifier failure log : > > > > #458/u test1 ld_imm64 FAIL > > Unexpected verifier log! > > EXP: R1 pointer comparison > > RES: > > FAIL > > Unexpected error message! > > EXP: R1 pointer comparison > > RES: jump to reserved code from insn 0 to 2 > > verification time 22 usec > > stack depth 0 > > processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 > > > > jump to reserved code from insn 0 to 2 > > verification time 22 usec > > stack depth 0 > > processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 > > #458/p test1 ld_imm64 FAIL > > Unexpected verifier log! > > EXP: invalid BPF_LD_IMM insn > > RES: > > FAIL > > Unexpected error message! > > EXP: invalid BPF_LD_IMM insn > > RES: jump to reserved code from insn 0 to 2 > > verification time 9 usec > > stack depth 0 > > processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 > > > > jump to reserved code from insn 0 to 2 > > verification time 9 usec > > stack depth 0 > > processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 > > #459/u test2 ld_imm64 FAIL > > Unexpected verifier log! > > EXP: R1 pointer comparison > > RES: > > FAIL > > Unexpected error message! > > EXP: R1 pointer comparison > > RES: jump to reserved code from insn 0 to 2 > > verification time 11 usec > > stack depth 0 > > processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 > > > > jump to reserved code from insn 0 to 2 > > verification time 11 usec > > stack depth 0 > > processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 > > #459/p test2 ld_imm64 FAIL > > Unexpected verifier log! > > EXP: invalid BPF_LD_IMM insn > > RES: > > FAIL > > Unexpected error message! > > EXP: invalid BPF_LD_IMM insn > > RES: jump to reserved code from insn 0 to 2 > > verification time 8 usec > > stack depth 0 > > processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 > > > > jump to reserved code from insn 0 to 2 > > verification time 8 usec > > stack depth 0 > > processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 > > #460/u test3 ld_imm64 OK > > > > >> func#0 @0 > > >> jump to reserved code from insn 8 to 7 > > >> > > >> --- > > >> > > >> > > >> Signed-off-by: Hao Sun <sunhao.th@xxxxxxxxx> > > > > nit: This needs to be before the "---" line. > > > > >> --- > > >> kernel/bpf/verifier.c | 7 +++++++ > > >> 1 file changed, 7 insertions(+) > > >> > > >> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > >> index eed7350e15f4..725ac0b464cf 100644 > > >> --- a/kernel/bpf/verifier.c > > >> +++ b/kernel/bpf/verifier.c > > >> @@ -14980,6 +14980,7 @@ static int push_insn(int t, int w, int e, struct bpf_verifier_env *env, > > >> { > > >> int *insn_stack = env->cfg.insn_stack; > > >> int *insn_state = env->cfg.insn_state; > > >> + struct bpf_insn *insns = env->prog->insnsi; > > >> > > >> if (e == FALLTHROUGH && insn_state[t] >= (DISCOVERED | FALLTHROUGH)) > > >> return DONE_EXPLORING; > > >> @@ -14993,6 +14994,12 @@ static int push_insn(int t, int w, int e, struct bpf_verifier_env *env, > > >> return -EINVAL; > > >> } > > >> > > >> + if (e == BRANCH && insns[w].code == 0) { > > >> + verbose_linfo(env, t, "%d", t); > > >> + verbose(env, "jump to reserved code from insn %d to %d\n", t, w); > > >> + return -EINVAL; > > >> + } > > > > Other than that, lgtm. > > We do rely quite a lot on verifier not complaining eagerly about some > potentially invalid instructions if it's provable that some portion of > the code won't ever be reached (think using .rodata variables for > feature gating, poisoning intructions due to failed CO-RE relocation, > which libbpf does actively, except it's using a call to non-existing > helper). As such, check_cfg() is a wrong place to do such validity > checks because some of the branches might never be run and validated > in practice. > Don't really agree. Jump to the middle of ld_imm64 is just like jumping out of bounds, both break the CFG integrity immediately. For those apparently incorrect jumps, rejecting early makes everything simple; otherwise, we probably need some rewrite in the end. Also, as you mentioned, libbpf relies on non-existing helpers, not jump to the middle of ld_imm64. It seems better and easier to not leave this hole. > This seems like a pretty obscure case of fuzzer generated test with > random jumps into the middle of ldimm64 instruction. I think the tool > should be able to avoid this or handle verifier log just fine in such > situations. On the other hand, valid code generated by compilers will > never have such jumps. > > So perhaps we can improve existing "invalid BPF_LD_IMM insn" message, > but let's not teach check_cfg() more checks than necessary? > Improving that `invalid BPF_LD_IMM` log does not solve the problem, the issue here is an invalid jump. Also, there could be various causes that make the verifier see an invalid BPF_LD_IMM in check_ld_imm(). > > > > >> if (e == BRANCH) { > > >> /* mark branch target for state pruning */ > > >> mark_prune_point(env, w); > > >> > >