Re: [PATCH v2 bpf-next 5/7] bpf: Consider non-owning refs to refcounted nodes RCU protected

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 8/21/23 7:37 PM, Yonghong Song wrote:


On 8/21/23 12:33 PM, Dave Marchevsky wrote:
An earlier patch in the series ensures that the underlying memory of
nodes with bpf_refcount - which can have multiple owners - is not reused
until RCU grace period has elapsed. This prevents
use-after-free with non-owning references that may point to
recently-freed memory. While RCU read lock is held, it's safe to
dereference such a non-owning ref, as by definition RCU GP couldn't have
elapsed and therefore underlying memory couldn't have been reused.

 From the perspective of verifier "trustedness" non-owning refs to
refcounted nodes are now trusted only in RCU CS and therefore should no
longer pass is_trusted_reg, but rather is_rcu_reg. Let's mark them
MEM_RCU in order to reflect this new state.

Signed-off-by: Dave Marchevsky <davemarchevsky@xxxxxx>
---
  include/linux/bpf.h   |  3 ++-
  kernel/bpf/verifier.c | 13 ++++++++++++-
  2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index eced6400f778..12596af59c00 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -653,7 +653,8 @@ enum bpf_type_flag {
      MEM_RCU            = BIT(13 + BPF_BASE_TYPE_BITS),
      /* Used to tag PTR_TO_BTF_ID | MEM_ALLOC references which are non-owning.
-     * Currently only valid for linked-list and rbtree nodes.
+     * Currently only valid for linked-list and rbtree nodes. If the nodes
+     * have a bpf_refcount_field, they must be tagged MEM_RCU as well.
       */
      NON_OWN_REF        = BIT(14 + BPF_BASE_TYPE_BITS),
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 8db0afa5985c..55607ab30522 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -8013,6 +8013,7 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
      case PTR_TO_BTF_ID | PTR_TRUSTED:
      case PTR_TO_BTF_ID | MEM_RCU:
      case PTR_TO_BTF_ID | MEM_ALLOC | NON_OWN_REF:
+    case PTR_TO_BTF_ID | MEM_ALLOC | NON_OWN_REF | MEM_RCU:
          /* When referenced PTR_TO_BTF_ID is passed to release function,
           * its fixed offset must be 0. In the other cases, fixed offset
           * can be non-zero. This was already checked above. So pass
@@ -10479,6 +10480,7 @@ static int process_kf_arg_ptr_to_btf_id(struct bpf_verifier_env *env,   static int ref_set_non_owning(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
  {
      struct bpf_verifier_state *state = env->cur_state;
+    struct btf_record *rec = reg_btf_record(reg);
      if (!state->active_lock.ptr) {
          verbose(env, "verifier internal error: ref_set_non_owning w/o active lock\n"); @@ -10491,6 +10493,9 @@ static int ref_set_non_owning(struct bpf_verifier_env *env, struct bpf_reg_state
      }
      reg->type |= NON_OWN_REF;
+    if (rec->refcount_off >= 0)
+        reg->type |= MEM_RCU;

Should the above MEM_RCU marking be done unless reg access is in
rcu critical section?

I think we still have issues for state resetting
with bpf_spin_unlock() and bpf_rcu_read_unlock(), both of which
will try to convert the reg state to PTR_UNTRUSTED.

Let us say reg state is
   PTR_TO_BTF_ID | MEM_ALLOC | NON_OWN_REF | MEM_RCU

(1). If hitting bpf_spin_unlock(), since MEM_RCU is in
the reg state, the state should become
   PTR_TO_BTF_ID | MEM_ALLOC | MEM_RCU
some additional code might be needed so we wont have
verifier complaints about ref_obj_id == 0.

(2). If hitting bpf_rcu_read_unlock(), the state should become
   PTR_TO_BTF_ID | MEM_ALLOC | NON_OWN_REF
since register access still in bpf_spin_lock() region.

Does this make sense?

Okay, seems scenario (2) is not possible as bpf_rcu_read_unlock()
is not allowed in bpf spin lock region.


+
      return 0;
  }
@@ -11328,6 +11333,11 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
          struct bpf_func_state *state;
          struct bpf_reg_state *reg;
+        if (in_rbtree_lock_required_cb(env) && (rcu_lock || rcu_unlock)) { +            verbose(env, "Calling bpf_rcu_read_{lock,unlock} in unnecessary rbtree callback\n");
+            return -EACCES;
+        }
+
          if (rcu_lock) {
              verbose(env, "nested rcu read lock (kernel function %s)\n", func_name);
              return -EINVAL;
@@ -16689,7 +16699,8 @@ static int do_check(struct bpf_verifier_env *env)
                      return -EINVAL;
                  }
-                if (env->cur_state->active_rcu_lock) {
+                if (env->cur_state->active_rcu_lock &&
+                    !in_rbtree_lock_required_cb(env)) {
                      verbose(env, "bpf_rcu_read_unlock is missing\n");
                      return -EINVAL;
                  }





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux