On Fri, Jul 15, 2022 at 01:49:28PM +0200, Kumar Kartikeya Dwivedi wrote: > On Thu, 14 Jul 2022 at 19:33, Dave Marchevsky <davemarchevsky@xxxxxx> wrote: > > > > On 7/14/22 2:30 AM, Kumar Kartikeya Dwivedi wrote: > > > On Thu, 14 Jul 2022 at 01:46, Dave Marchevsky <davemarchevsky@xxxxxx> wrote: > > >> > > >> The may_be_acquire_function check is a weaker version of > > >> is_acquire_function that only uses bpf_func_id to determine whether a > > >> func may be acquiring a reference. Most funcs which acquire a reference > > >> do so regardless of their input, so bpf_func_id is all that's necessary > > >> to make an accurate determination. However, map_lookup_elem only > > >> acquires when operating on certain MAP_TYPEs, so commit 64d85290d79c > > >> ("bpf: Allow bpf_map_lookup_elem for SOCKMAP and SOCKHASH") added the > > >> may_be check. > > >> > > >> Any helper which always acquires a reference should pass both > > >> may_be_acquire_function and is_acquire_function checks. Recently-added > > >> kptr_xchg passes the latter but not the former. This patch resolves this > > >> discrepancy and does some refactoring such that the list of functions > > >> which always acquire is in one place so future updates are in sync. > > >> > > > > > > Thanks for the fix. > > > I actually didn't add this on purpose, because the reason for using > > > the may_be_acquire_function (in check_refcount_ok) doesn't apply to > > > kptr_xchg, but maybe that was a poor choice on my part. I'm actually > > > not sure of the need for may_be_acquire_function, and > > > check_refcount_ok. > > > > > > Can we revisit why iit is needed? It only prevents > > > ARG_PTR_TO_SOCK_COMMON (which is not the only arg type that may be > > > refcounted) from being argument type of acquire functions. What is the > > > reason behind this? Should we rename arg_type_may_be_refcounted to a > > > less confusing name? It probably only applies to socket lookup > > > helpers. > > > > > > > I'm just starting to dive into this reference acquire/release stuff, so I was > > also hoping someone could clarify the semantics here :). > > > > Seems like the purpose of check_refcount_ok is to 1) limit helpers to one > > refcounted arg - currently determined by arg_type_may_be_refcounted, which was > > added as arg_type_is_refcounted in [0]; and 2) disallow helpers which acquire > > I think this is already prevented in check_func_arg when it sees > meta->ref_obj_id already set, which is the more correct way to do it > instead of looking at argument types alone. > > > a reference from taking refcounted args. The reasoning behind 2) isn't clear to > > me but my best guess based on [1] is that there's some delineation between > > "helpers which cast a refcounted thing but don't acquire" and helpers that > > acquire. > > > > Maybe we can add similar type tags to OBJ_RELEASE, which you added in > > [2], to tag args which are casted in this manner and avoid hardcoding > > ARG_PTR_TO_SOCK_COMMON. Or at least rename arg_type_may_be_refcounted now that > > other things may be refcounted but don't need similar casting treatment. > > > > IMO there isn't any problem per se in an acquire function taking > refcounted argument, so maybe it was just a precautionary check back > then. I think the intention was that ARG_PTR_TO_SOCK_COMMON is never > an argument type of an acquire function, but I don't know why. The > commit [1] says: > > - check_refcount_ok() ensures is_acquire_function() cannot take > arg_type_may_be_refcounted() as its argument. > > Back then only socket lookup functions were acquire functions. > Maybe Martin can shed some light as to why it was the case and whether > this check is needed. Trying to recall from the log history. I believe {is,may_be}_acquire_function() in check_refcount_ok() was for pre-cautionary. At that time, in check_helper_call(), regs[BPF_REG_0].ref_obj_id was handled for is_acquire_function() first before is_ptr_cast_function(). If somehow a func proto could do both 'acquire' and 'ptr_cast', the ref_obj_id tracking will break. May be a better check was to ensure a func cannot be both acquire and ptr_cast. This ordering in check_helper_call() is reversed now though, so I think may_be_acquire_function() can be removed from check_refcount_ok(). Regarding to the original 'return count <= 1' check in check_refcount_ok(), I believe the idea was to ensure the release func_proto only takes one arg that is refcnt-ed. [bpf_sk_release was the only release func proto]. Otherwise, if a release func can take two args that could have ref_obj_id, the meta->ref_obj_id tracking will not work. Think about only one of them has ref_obj_id and it is not the arg that the func proto is actually releasing. Since there is OBJ_RELEASE now and sk is not the only refcnt type, check_refcount_ok() may as well count by OBJ_RELEASE instead of arg-type. > > > [0]: fd978bf7fd31 ("bpf: Add reference tracking to verifier") > > [1]: 1b986589680a ("bpf: Fix bpf_tcp_sock and bpf_sk_fullsock issue related to bpf_sk_release") > > [2]: 8f14852e8911 ("bpf: Tag argument to be released in bpf_func_proto") > > > > >> Fixes: c0a5a21c25f3 ("bpf: Allow storing referenced kptr in map") > > >> Signed-off-by: Dave Marchevsky <davemarchevsky@xxxxxx> > > >> --- > > >> > > >> Sent to bpf-next instead of bpf as kptr_xchg not passing > > >> may_be_acquire_function isn't currently breaking anything, just > > >> logically inconsistent. > > >> > > >> kernel/bpf/verifier.c | 33 +++++++++++++++++++++++---------- > > >> 1 file changed, 23 insertions(+), 10 deletions(-) > > >> > > >> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > >> index 26e7e787c20a..df4b923e77de 100644 > > >> --- a/kernel/bpf/verifier.c > > >> +++ b/kernel/bpf/verifier.c > > >> @@ -477,13 +477,30 @@ static bool type_may_be_null(u32 type) > > >> return type & PTR_MAYBE_NULL; > > >> } > > >> > > >> +/* These functions acquire a resource that must be later released > > >> + * regardless of their input > > >> + */ > > >> +static bool __check_function_always_acquires(enum bpf_func_id func_id) > > >> +{ > > >> + switch (func_id) { > > >> + case BPF_FUNC_sk_lookup_tcp: > > >> + case BPF_FUNC_sk_lookup_udp: > > >> + case BPF_FUNC_skc_lookup_tcp: > > >> + case BPF_FUNC_ringbuf_reserve: > > >> + case BPF_FUNC_kptr_xchg: > > >> + return true; > > >> + default: > > >> + return false; > > >> + } > > >> +} > > >> + > > >> static bool may_be_acquire_function(enum bpf_func_id func_id) > > >> { > > >> - return func_id == BPF_FUNC_sk_lookup_tcp || > > >> - func_id == BPF_FUNC_sk_lookup_udp || > > >> - func_id == BPF_FUNC_skc_lookup_tcp || > > >> - func_id == BPF_FUNC_map_lookup_elem || > > >> - func_id == BPF_FUNC_ringbuf_reserve; > > >> + /* See is_acquire_function for the conditions under which funcs > > >> + * not in __check_function_always_acquires acquire a resource > > >> + */ > > >> + return __check_function_always_acquires(func_id) || > > >> + func_id == BPF_FUNC_map_lookup_elem; > > >> } > > >> > > >> static bool is_acquire_function(enum bpf_func_id func_id, > > >> @@ -491,11 +508,7 @@ static bool is_acquire_function(enum bpf_func_id func_id, > > >> { > > >> enum bpf_map_type map_type = map ? map->map_type : BPF_MAP_TYPE_UNSPEC; > > >> > > >> - if (func_id == BPF_FUNC_sk_lookup_tcp || > > >> - func_id == BPF_FUNC_sk_lookup_udp || > > >> - func_id == BPF_FUNC_skc_lookup_tcp || > > >> - func_id == BPF_FUNC_ringbuf_reserve || > > >> - func_id == BPF_FUNC_kptr_xchg) > > >> + if (__check_function_always_acquires(func_id)) > > >> return true; > > >> > > >> if (func_id == BPF_FUNC_map_lookup_elem && > > >> -- > > >> 2.30.2 > > >>