On 8/2/23 6:50 PM, Alexei Starovoitov wrote: > On Tue, Aug 01, 2023 at 01:36:28PM -0700, Dave Marchevsky wrote: >> The previous patch in the series ensures that the underlying memory of >> nodes with bpf_refcount - which can have multiple owners - is not reused >> until RCU Tasks Trace grace period has elapsed. This prevents > > Here and in the cover letter... above should probably be "RCU grace period" > and not "RCU tasks trace grace period". > bpf progs will reuse objects after normal RCU. > We're waiting for RCU tasks trace GP to free into slab. > Will fix. >> use-after-free with non-owning references that may point to >> recently-freed memory. While RCU read lock is held, it's safe to >> dereference such a non-owning ref, as by definition RCU GP couldn't have >> elapsed and therefore underlying memory couldn't have been reused. >> >> From the perspective of verifier "trustedness" non-owning refs to >> refcounted nodes are now trusted only in RCU CS and therefore should no >> longer pass is_trusted_reg, but rather is_rcu_reg. Let's mark them >> MEM_RCU in order to reflect this new state. >> >> Similarly to bpf_spin_unlock being a non-owning ref invalidation point, >> where non-owning ref reg states are clobbered so that they cannot be >> used outside of the critical section, currently all MEM_RCU regs are >> marked untrusted after bpf_rcu_read_unlock. This patch makes >> bpf_rcu_read_unlock a non-owning ref invalidation point as well, >> clobbering the non-owning refs instead of marking untrusted. In the >> future we may want to allow untrusted non-owning refs in which case we >> can remove this custom logic without breaking BPF programs as it's more >> restrictive than the default. That's a big change in semantics, though, >> and this series is focused on fixing the use-after-free in most >> straightforward way. >> >> Signed-off-by: Dave Marchevsky <davemarchevsky@xxxxxx> >> --- >> include/linux/bpf.h | 3 ++- >> kernel/bpf/verifier.c | 17 +++++++++++++++-- >> 2 files changed, 17 insertions(+), 3 deletions(-) >> >> diff --git a/include/linux/bpf.h b/include/linux/bpf.h >> index ceaa8c23287f..37fba01b061a 100644 >> --- a/include/linux/bpf.h >> +++ b/include/linux/bpf.h >> @@ -653,7 +653,8 @@ enum bpf_type_flag { >> MEM_RCU = BIT(13 + BPF_BASE_TYPE_BITS), >> >> /* Used to tag PTR_TO_BTF_ID | MEM_ALLOC references which are non-owning. >> - * Currently only valid for linked-list and rbtree nodes. >> + * Currently only valid for linked-list and rbtree nodes. If the nodes >> + * have a bpf_refcount_field, they must be tagged MEM_RCU as well. >> */ >> NON_OWN_REF = BIT(14 + BPF_BASE_TYPE_BITS), >> >> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c >> index 9014b469dd9d..4bda365000d3 100644 >> --- a/kernel/bpf/verifier.c >> +++ b/kernel/bpf/verifier.c >> @@ -469,7 +469,8 @@ static bool type_is_ptr_alloc_obj(u32 type) >> >> static bool type_is_non_owning_ref(u32 type) >> { >> - return type_is_ptr_alloc_obj(type) && type_flag(type) & NON_OWN_REF; >> + return type_is_ptr_alloc_obj(type) && >> + type_flag(type) & NON_OWN_REF; >> } >> >> static struct btf_record *reg_btf_record(const struct bpf_reg_state *reg) >> @@ -8012,6 +8013,7 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env, >> case PTR_TO_BTF_ID | PTR_TRUSTED: >> case PTR_TO_BTF_ID | MEM_RCU: >> case PTR_TO_BTF_ID | MEM_ALLOC | NON_OWN_REF: >> + case PTR_TO_BTF_ID | MEM_ALLOC | NON_OWN_REF | MEM_RCU: >> /* When referenced PTR_TO_BTF_ID is passed to release function, >> * its fixed offset must be 0. In the other cases, fixed offset >> * can be non-zero. This was already checked above. So pass >> @@ -10478,6 +10480,7 @@ static int process_kf_arg_ptr_to_btf_id(struct bpf_verifier_env *env, >> static int ref_set_non_owning(struct bpf_verifier_env *env, struct bpf_reg_state *reg) >> { >> struct bpf_verifier_state *state = env->cur_state; >> + struct btf_record *rec = reg_btf_record(reg); >> >> if (!state->active_lock.ptr) { >> verbose(env, "verifier internal error: ref_set_non_owning w/o active lock\n"); >> @@ -10490,6 +10493,9 @@ static int ref_set_non_owning(struct bpf_verifier_env *env, struct bpf_reg_state >> } >> >> reg->type |= NON_OWN_REF; >> + if (rec->refcount_off >= 0) >> + reg->type |= MEM_RCU; >> + >> return 0; >> } >> >> @@ -11327,10 +11333,16 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn, >> struct bpf_func_state *state; >> struct bpf_reg_state *reg; >> >> + if (in_rbtree_lock_required_cb(env) && (rcu_lock || rcu_unlock)) { >> + verbose(env, "can't rcu read {lock,unlock} in rbtree cb\n"); >> + return -EACCES; >> + } > > I guess it's ok to prevent cb from calling bpf_rcu_read_lock(), since it's unnecessary, > but pls make the message more verbose. Like: > verbose(env, "Calling bpf_rcu_read_{lock,unlock} in unnecessary rbtree callback\n"); > > so that users know why the verifier complains. > Technically it's ok to do so. Unnecessary is not a safety issue. > Well, for rcu_read_unlock it would be a safety issue, no? Feels easier to reason about if we can just say "RCU lock is held for the duration of the callback". >> + >> if (rcu_lock) { >> verbose(env, "nested rcu read lock (kernel function %s)\n", func_name); >> return -EINVAL; >> } else if (rcu_unlock) { >> + invalidate_non_owning_refs(env); > > I agree with Yonghong. It probably doesn't belong here. > rcu lock/unlock and spin_lock/unlock are separate critical sections. > Since ref_set_non_owning() adds extra MEM_RCU flag nothing extra needs to be done here. > Below code will make the pointers untrusted. > Will change. The desire here was to not loosen constraints on non-owning ref lifetime in this series. As I mention in my thoughts on Patch 3 in the cover letter, I do think we can loosen that in the future, but would like to avoid doing so in this fixes series. Regardless, because bpf_spin_unlock will happen before this executes, this line can be removed. >> bpf_for_each_reg_in_vstate(env->cur_state, state, reg, ({ >> if (reg->type & MEM_RCU) { >> reg->type &= ~(MEM_RCU | PTR_MAYBE_NULL); >> @@ -16679,7 +16691,8 @@ static int do_check(struct bpf_verifier_env *env) >> return -EINVAL; >> } >> >> - if (env->cur_state->active_rcu_lock) { >> + if (env->cur_state->active_rcu_lock && >> + !in_rbtree_lock_required_cb(env)) { > > I'm not following here. > Didn't you want to prevent bpf_rcu_read_lock/unlock inside cb? Why this change? > >> verbose(env, "bpf_rcu_read_unlock is missing\n"); >> return -EINVAL; >> } >> -- >> 2.34.1 >>