On Fri, 2023-11-17 at 11:46 -0500, Andrii Nakryiko wrote:
[...]
> > +static bool is_callback_iter_next(struct bpf_verifier_env *env, int insn_idx);
> > +
> >  /* For given verifier state backtrack_insn() is called from the last insn to
> >   * the first insn. Its purpose is to compute a bitmask of registers and
> >   * stack slots that needs precision in the parent verifier state.
> > @@ -4030,10 +4044,7 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
> >  				return -EFAULT;
> >  			return 0;
> >  		}
> > -	} else if ((bpf_helper_call(insn) &&
> > -		    is_callback_calling_function(insn->imm) &&
> > -		    !is_async_callback_calling_function(insn->imm)) ||
> > -		   (bpf_pseudo_kfunc_call(insn) && is_callback_calling_kfunc(insn->imm))) {
> > +	} else if (is_sync_callback_calling_insn(insn) && idx != subseq_idx - 1) {
>
> can you leave a comment why we need idx != subseq_idx - 1 check?

This check is needed to make sure that we are on the arc from callback
return to the callback-calling function; I'll extend the comment below.

> >  		/* callback-calling helper or kfunc call, which means
> >  		 * we are exiting from subprog, but unlike the subprog
> >  		 * call handling above, we shouldn't propagate
>
> [...]
>
> > @@ -12176,6 +12216,21 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
> >  		return -EACCES;
> >  	}
> >
> > +	/* Check the arguments */
> > +	err = check_kfunc_args(env, &meta, insn_idx);
> > +	if (err < 0)
> > +		return err;
> > +
> > +	if (meta.func_id == special_kfunc_list[KF_bpf_rbtree_add_impl]) {
>
> can't we use is_sync_callback_calling_kfunc() here?

No, because it passes 'set_rbtree_add_callback_state' as a parameter,
which is specific to rbtree_add, not to kfuncs in general.

> > +		err = push_callback_call(env, insn, insn_idx, meta.subprogno,
> > +					 set_rbtree_add_callback_state);
> > +		if (err) {
> > +			verbose(env, "kfunc %s#%d failed callback verification\n",
> > +				func_name, meta.func_id);
> > +			return err;
> > +		}
> > +	}
> > +
>
> [...]
>
> > diff --git a/tools/testing/selftests/bpf/prog_tests/cb_refs.c b/tools/testing/selftests/bpf/prog_tests/cb_refs.c
> > index 3bff680de16c..b5aa168889c1 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/cb_refs.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/cb_refs.c
> > @@ -21,12 +21,14 @@ void test_cb_refs(void)
> >  {
> >  	LIBBPF_OPTS(bpf_object_open_opts, opts, .kernel_log_buf = log_buf,
> >  		    .kernel_log_size = sizeof(log_buf),
> > -		    .kernel_log_level = 1);
> > +		    .kernel_log_level = 1 | 2 | 4);
>
> nit: 1 is redundant if 2 is specified, so just `2 | 4` ?

This is a leftover, sorry, I'll remove the changes to cb_refs.c.

[...]
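Going back to the idx != subseq_idx - 1 question above, the extended
comment could read roughly like this (just a sketch on top of the quoted
hunk, exact wording to be settled in the next revision):

	} else if (is_sync_callback_calling_insn(insn) && idx != subseq_idx - 1) {
		/* Exit from a callback subprog back to the callback-calling
		 * helper or kfunc call: the idx != subseq_idx - 1 check
		 * distinguishes backtracking along the callback-return ->
		 * call-insn arc from straight-line backtracking over the
		 * call insn itself.
		 * Unlike the subprog call handling above, we shouldn't
		 * propagate precision of r1-r5 here, as they are not
		 * arguments actually passed to the callback subprog.
		 */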
> > diff --git a/tools/testing/selftests/bpf/progs/verifier_subprog_precision.c b/tools/testing/selftests/bpf/progs/verifier_subprog_precision.c
> > index db6b3143338b..ead358679fe2 100644
> > --- a/tools/testing/selftests/bpf/progs/verifier_subprog_precision.c
> > +++ b/tools/testing/selftests/bpf/progs/verifier_subprog_precision.c
> > @@ -120,14 +120,12 @@ __naked int global_subprog_result_precise(void)
> >  SEC("?raw_tp")
> >  __success __log_level(2)
> >  __msg("14: (0f) r1 += r6")
> > -__msg("mark_precise: frame0: last_idx 14 first_idx 10")
> > +__msg("mark_precise: frame0: last_idx 14 first_idx 9")
> >  __msg("mark_precise: frame0: regs=r6 stack= before 13: (bf) r1 = r7")
> >  __msg("mark_precise: frame0: regs=r6 stack= before 12: (27) r6 *= 4")
> >  __msg("mark_precise: frame0: regs=r6 stack= before 11: (25) if r6 > 0x3 goto pc+4")
> >  __msg("mark_precise: frame0: regs=r6 stack= before 10: (bf) r6 = r0")
> > -__msg("mark_precise: frame0: parent state regs=r0 stack=:")
> > -__msg("mark_precise: frame0: last_idx 18 first_idx 0")
> > -__msg("mark_precise: frame0: regs=r0 stack= before 18: (95) exit")
> > +__msg("mark_precise: frame0: regs=r0 stack= before 9: (85) call bpf_loop")
>
> you are right that r0 returned from bpf_loop is not r0 returned from
> bpf_loop's callback, but we still have to go through callback
> instructions, right?

Should we? We are looking to make r0 precise, but what are the rules
for propagating that across the callback boundary? For bpf_loop() and
bpf_for_each_map_elem() that would mean marking r0 inside the callback
as precise, but in general that is callback-specific.

In a separate discussion with you and Alexei you mentioned that you are
going to send a patch-set that would force r0 to be precise on exit,
which would cover the current situation. Imo, it would make sense to
wait for that patch-set, as it would be simpler than changes in
backtrack_insn(), wdyt?

> so you removed few __msg() from subprog
> instruction history because it was too long a history or what? I'd
> actually keep those but update that in subprog we don't need r0 to be
> precise, that will make this test even clearer
>
> >  __naked int callback_result_precise(void)

Here is the relevant log fragment:

14: (0f) r1 += r6
mark_precise: frame0: last_idx 14 first_idx 9 subseq_idx -1
mark_precise: frame0: regs=r6 stack= before 13: (bf) r1 = r7
mark_precise: frame0: regs=r6 stack= before 12: (27) r6 *= 4
mark_precise: frame0: regs=r6 stack= before 11: (25) if r6 > 0x3 goto pc+4
mark_precise: frame0: regs=r6 stack= before 10: (bf) r6 = r0
mark_precise: frame0: regs=r0 stack= before 9: (85) call bpf_loop#181
15: R1_w=map_value(off=0,ks=4,vs=16,smin=smin32=0,smax=umax=smax32=umax32=12,var_off=(0x0; 0xc)) R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=12,var_off=(0x0; 0xc))
15: (61) r0 = *(u32 *)(r1 +0)         ; R0_w=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff)) R1_w=map_value(off=0,ks=4,vs=16,smin=smin32=0,smax=umax=smax32=umax32=12,var_off=(0x0; 0xc))
16: (95) exit
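To make the r0 distinction above concrete, here is a minimal standalone
sketch of a bpf_loop() user (program and callback names are made up for
illustration; only the bpf_loop() helper and its callback convention
come from the UAPI):

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* Callback invoked by bpf_loop(); its return value (the callback's r0)
 * only tells bpf_loop whether to continue (0) or break out (1).
 */
static long loop_cb(__u64 index, void *ctx)
{
	return 0;
}

SEC("raw_tp")
int loop_prog(void *ctx)
{
	/* The r0 the caller sees after this call is bpf_loop()'s own return
	 * value (number of iterations performed or a negative error), not
	 * the callback's r0, so marking the caller's r0 precise does not by
	 * itself say anything about the callback's r0.
	 */
	long ret = bpf_loop(4, loop_cb, NULL, 0);

	return ret < 0 ? 1 : 0;
}

char _license[] SEC("license") = "GPL";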