On Thu, May 16, 2024 at 4:59 PM Kumar Kartikeya Dwivedi <memxor@xxxxxxxxx> wrote: > > On Fri, 10 May 2024 at 21:24, Amery Hung <ameryhung@xxxxxxxxx> wrote: > > > > This patch supports struct_ops programs that acqurie referenced kptrs > > throguh arguments. In Qdisc_ops, an skb is passed to ".enqueue" in the > > first argument. The qdisc becomes the sole owner of the skb and must > > enqueue or drop the skb. This matches the referenced kptr semantic > > in bpf. However, the existing practice of acquiring a referenced kptr via > > a kfunc with KF_ACQUIRE does not play well in this case. Calling kfuncs > > repeatedly allows the user to acquire multiple references, while there > > should be only one reference to a unique skb in a qdisc. > > > > The solutioin is to make a struct_ops program automatically acquire a > > referenced kptr through a tagged argument in the stub function. When > > tagged with "__ref_acquired" (suggestion for a better name?), an > > reference kptr (ref_obj_id > 0) will be acquired automatically when > > entering the program. In addition, only the first read to the arguement > > is allowed and it will yeild a referenced kptr. > > > > Signed-off-by: Amery Hung <amery.hung@xxxxxxxxxxxxx> > > --- > > include/linux/bpf.h | 3 +++ > > kernel/bpf/bpf_struct_ops.c | 17 +++++++++++++---- > > kernel/bpf/btf.c | 10 +++++++++- > > kernel/bpf/verifier.c | 16 +++++++++++++--- > > 4 files changed, 38 insertions(+), 8 deletions(-) > > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h > > index 9c6a7b8ff963..6aabca1581fe 100644 > > --- a/include/linux/bpf.h > > +++ b/include/linux/bpf.h > > @@ -914,6 +914,7 @@ struct bpf_insn_access_aux { > > struct { > > struct btf *btf; > > u32 btf_id; > > + u32 ref_obj_id; > > }; > > }; > > struct bpf_verifier_log *log; /* for verbose logs */ > > @@ -1416,6 +1417,8 @@ struct bpf_ctx_arg_aux { > > enum bpf_reg_type reg_type; > > struct btf *btf; > > u32 btf_id; > > + u32 ref_obj_id; > > + bool ref_acquired; > > }; > > > > struct btf_mod_pair { > > diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c > > index 86c7884abaf8..bca8e5936846 100644 > > --- a/kernel/bpf/bpf_struct_ops.c > > +++ b/kernel/bpf/bpf_struct_ops.c > > @@ -143,6 +143,7 @@ void bpf_struct_ops_image_free(void *image) > > } > > > > #define MAYBE_NULL_SUFFIX "__nullable" > > +#define REF_ACQUIRED_SUFFIX "__ref_acquired" > > #define MAX_STUB_NAME 128 > > > > /* Return the type info of a stub function, if it exists. > > @@ -204,6 +205,7 @@ static int prepare_arg_info(struct btf *btf, > > struct bpf_struct_ops_arg_info *arg_info) > > { > > const struct btf_type *stub_func_proto, *pointed_type; > > + bool is_nullable = false, is_ref_acquired = false; > > const struct btf_param *stub_args, *args; > > struct bpf_ctx_arg_aux *info, *info_buf; > > u32 nargs, arg_no, info_cnt = 0; > > @@ -240,8 +242,11 @@ static int prepare_arg_info(struct btf *btf, > > /* Skip arguments that is not suffixed with > > * "__nullable". > > */ > > - if (!btf_param_match_suffix(btf, &stub_args[arg_no], > > - MAYBE_NULL_SUFFIX)) > > + is_nullable = btf_param_match_suffix(btf, &stub_args[arg_no], > > + MAYBE_NULL_SUFFIX); > > + is_ref_acquired = btf_param_match_suffix(btf, &stub_args[arg_no], > > + REF_ACQUIRED_SUFFIX); > > + if (!(is_nullable || is_ref_acquired)) > > continue; > > > > /* Should be a pointer to struct */ > > @@ -269,11 +274,15 @@ static int prepare_arg_info(struct btf *btf, > > } > > > > /* Fill the information of the new argument */ > > - info->reg_type = > > - PTR_TRUSTED | PTR_TO_BTF_ID | PTR_MAYBE_NULL; > > info->btf_id = arg_btf_id; > > info->btf = btf; > > info->offset = offset; > > + if (is_nullable) { > > + info->reg_type = PTR_TRUSTED | PTR_TO_BTF_ID | PTR_MAYBE_NULL; > > + } else if (is_ref_acquired) { > > + info->reg_type = PTR_TRUSTED | PTR_TO_BTF_ID; > > + info->ref_acquired = true; > > + } > > > > info++; > > info_cnt++; > > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c > > index 8c95392214ed..e462fb4a4598 100644 > > --- a/kernel/bpf/btf.c > > +++ b/kernel/bpf/btf.c > > @@ -6316,7 +6316,8 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type, > > > > /* this is a pointer to another type */ > > for (i = 0; i < prog->aux->ctx_arg_info_size; i++) { > > - const struct bpf_ctx_arg_aux *ctx_arg_info = &prog->aux->ctx_arg_info[i]; > > + struct bpf_ctx_arg_aux *ctx_arg_info = > > + (struct bpf_ctx_arg_aux *)&prog->aux->ctx_arg_info[i]; > > > > if (ctx_arg_info->offset == off) { > > if (!ctx_arg_info->btf_id) { > > @@ -6324,9 +6325,16 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type, > > return false; > > } > > > > + if (ctx_arg_info->ref_acquired && !ctx_arg_info->ref_obj_id) { > > + bpf_log(log, "cannot acquire a reference to context argument offset %u\n", off); > > + return false; > > + } > > + > > info->reg_type = ctx_arg_info->reg_type; > > info->btf = ctx_arg_info->btf ? : btf_vmlinux; > > info->btf_id = ctx_arg_info->btf_id; > > + info->ref_obj_id = ctx_arg_info->ref_obj_id; > > + ctx_arg_info->ref_obj_id = 0; > > return true; > > I think this is fragile. What if the compiler produces two independent > paths in the program which read the skb pointer once? > Technically, the program is still reading the skb pointer once at runtime. > Then you will reset ref_obj_id to 0 when exploring one, and assign as > 0 in the other one, causing errors. > ctx_arg_info appears to be global for the program. > > I think the better way would be to check if ref_obj_id is still part > of the reference state. > If the ref_obj_id has already been dropped from reference_state, then > any loads should get ref_obj_id = 0. > That would happen when dropping or enqueueing the skb into qdisc, > which would (I presume) do release_reference_state(ref_obj_id). > Otherwise, all of them can share the same ref_obj_id. You won't have > to implement "can only read once" logic, > and when you enqueue stuff in the qdisc, all identical copies produced > from different load instructions will be invalidated. > Same ref_obj_id == unique ownership of the same object. > You can already have multiple copies through rX = rY, multiple ctx > loads of skb will produce a similar verifier state. > > So, on entry, assign ctx_arg_info->ref_obj_id uniquely, then on each load: > if reference_state.find(ctx_arg_info->ref_obj_id) == true; then > info->ref_obj_id = ctx_arg_info->ref_obj_id; else info->ref_obj_id = > 0; > > Let me know if I missed something. You are right. The current approach will falsely reject valid programs, and your suggestion makes sense. Thanks, Amery