Re: [RFC PATCH bpf-next 08/11] bpf: Add OBJ_NON_OWNING_REF type flag

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 7/22/22 11:34 AM, Dave Marchevsky wrote:
Consider a pointer to a type that would normally need acquire / release
semantics to be safely held. There may be scenarios where such a pointer
can be safely held without the need to acquire a reference.

For example, although a PTR_TO_BTF_ID for a rbtree_map node is released
via bpf_rbtree_add helper, the helper doesn't change the address of the
node and must be called with the rbtree_map's spinlock held. Since the
only way to remove a node from the rbtree - bpf_rbtree_remove helper -
requires the same lock, the newly-added node cannot be removed by a
concurrently-running program until the lock is released. Therefore it is
safe to hold a reference to this node until bpf_rbtree_unlock is called.

This patch introduces a new type flag and associated verifier logic to
handle such "non-owning" references.

Currently the only usecase I have is the rbtree example above, so the
verifier logic is straightforward:
   * Tag return types of bpf_rbtree_{add,find} with OBJ_NON_OWNING_REF
     * These both require the rbtree lock to be held to return anything
     non-NULL
     * Since ret type for both is PTR_TO_BTF_ID_OR_NULL, if lock is not
     held and NULL is returned, existing mark_ptr_or_null_reg logic
     will clear reg type.
     * So if mark_ptr_or_null_reg logic turns the returned reg into a
     PTR_TO_BTF_ID | OBJ_NON_OWNING_REF, verifier knows lock is held.

   * When the lock is released the verifier invalidates any regs holding
   non owning refs similarly to existing release_reference logic - but no
   need to clear ref_obj_id as an 'owning' reference was never acquired.

[ TODO: Currently the invalidation logic in
clear_rbtree_node_non_owning_refs is not parametrized by map so
unlocking any rbtree lock will invalidate all non-owning refs ]

probably should be parametrized by 'lock' instead of by 'map'.

Signed-off-by: Dave Marchevsky <davemarchevsky@xxxxxx>
---
  include/linux/bpf.h   |  1 +
  kernel/bpf/rbtree.c   |  4 +--
  kernel/bpf/verifier.c | 63 +++++++++++++++++++++++++++++++++++++++----
  3 files changed, 61 insertions(+), 7 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index eb8c550db0e2..c9c4b4fb019c 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -412,6 +412,7 @@ enum bpf_type_flag {
  	/* Size is known at compile time. */
  	MEM_FIXED_SIZE		= BIT(10 + BPF_BASE_TYPE_BITS),
+ OBJ_NON_OWNING_REF = BIT(11 + BPF_BASE_TYPE_BITS),
  	__BPF_TYPE_FLAG_MAX,
  	__BPF_TYPE_LAST_FLAG	= __BPF_TYPE_FLAG_MAX - 1,
  };
diff --git a/kernel/bpf/rbtree.c b/kernel/bpf/rbtree.c
index 5b1ab73e164f..34864fc83209 100644
--- a/kernel/bpf/rbtree.c
+++ b/kernel/bpf/rbtree.c
@@ -111,7 +111,7 @@ BPF_CALL_3(bpf_rbtree_add, struct bpf_map *, map, void *, value, void *, cb)
  const struct bpf_func_proto bpf_rbtree_add_proto = {
  	.func = bpf_rbtree_add,
  	.gpl_only = true,
-	.ret_type = RET_PTR_TO_BTF_ID_OR_NULL,
+	.ret_type = RET_PTR_TO_BTF_ID_OR_NULL | OBJ_NON_OWNING_REF,
  	.arg1_type = ARG_CONST_MAP_PTR,
  	.arg2_type = ARG_PTR_TO_BTF_ID | OBJ_RELEASE,
  	.arg2_btf_id = &bpf_rbtree_btf_ids[0],
@@ -133,7 +133,7 @@ BPF_CALL_3(bpf_rbtree_find, struct bpf_map *, map, void *, key, void *, cb)
  const struct bpf_func_proto bpf_rbtree_find_proto = {
  	.func = bpf_rbtree_find,
  	.gpl_only = true,
-	.ret_type = RET_PTR_TO_BTF_ID_OR_NULL,
+	.ret_type = RET_PTR_TO_BTF_ID_OR_NULL | OBJ_NON_OWNING_REF,
  	.ret_btf_id = &bpf_rbtree_btf_ids[0],
  	.arg1_type = ARG_CONST_MAP_PTR,
  	.arg2_type = ARG_ANYTHING,

To prevent race the bpf_rbtree_find/add feels not enough.
bpf_rbtree_alloc_node probably needs it too?
Otherwise after bpf_rbtree_unlock the stashed pointer from alloc
will still be accessible?
May be it's actually ok to access by the prog?
Due to the way preallocated maps work the elements can be use-after-free
within the same map. It's similar to SLAB_TYPESAFE_BY_RCU.
The elements won't be released into global kernel memory while progs are looking at them.
It seems to me it's ok to do away without OBJ_NON_OWNING_REF concept.
The prog might have pointers to rbtree elements and they will be
accessible by the prog even after bpf_rbtree_unlock().
This access will be racy, but it's safe.
So just drop this patch?



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux