On Mon, Jul 18, 2022 at 5:36 PM Martin KaFai Lau <kafai@xxxxxx> wrote: > > On Fri, Jul 15, 2022 at 01:49:28PM +0200, Kumar Kartikeya Dwivedi wrote: > > On Thu, 14 Jul 2022 at 19:33, Dave Marchevsky <davemarchevsky@xxxxxx> wrote: > > > > > > On 7/14/22 2:30 AM, Kumar Kartikeya Dwivedi wrote: > > > > On Thu, 14 Jul 2022 at 01:46, Dave Marchevsky <davemarchevsky@xxxxxx> wrote: > > > >> > > > >> The may_be_acquire_function check is a weaker version of > > > >> is_acquire_function that only uses bpf_func_id to determine whether a > > > >> func may be acquiring a reference. Most funcs which acquire a reference > > > >> do so regardless of their input, so bpf_func_id is all that's necessary > > > >> to make an accurate determination. However, map_lookup_elem only > > > >> acquires when operating on certain MAP_TYPEs, so commit 64d85290d79c > > > >> ("bpf: Allow bpf_map_lookup_elem for SOCKMAP and SOCKHASH") added the > > > >> may_be check. > > > >> > > > >> Any helper which always acquires a reference should pass both > > > >> may_be_acquire_function and is_acquire_function checks. Recently-added > > > >> kptr_xchg passes the latter but not the former. This patch resolves this > > > >> discrepancy and does some refactoring such that the list of functions > > > >> which always acquire is in one place so future updates are in sync. > > > >> > > > > > > > > Thanks for the fix. > > > > I actually didn't add this on purpose, because the reason for using > > > > the may_be_acquire_function (in check_refcount_ok) doesn't apply to > > > > kptr_xchg, but maybe that was a poor choice on my part. I'm actually > > > > not sure of the need for may_be_acquire_function, and > > > > check_refcount_ok. > > > > > > > > Can we revisit why iit is needed? It only prevents > > > > ARG_PTR_TO_SOCK_COMMON (which is not the only arg type that may be > > > > refcounted) from being argument type of acquire functions. What is the > > > > reason behind this? Should we rename arg_type_may_be_refcounted to a > > > > less confusing name? It probably only applies to socket lookup > > > > helpers. > > > > > > > > > > I'm just starting to dive into this reference acquire/release stuff, so I was > > > also hoping someone could clarify the semantics here :). > > > > > > Seems like the purpose of check_refcount_ok is to 1) limit helpers to one > > > refcounted arg - currently determined by arg_type_may_be_refcounted, which was > > > added as arg_type_is_refcounted in [0]; and 2) disallow helpers which acquire > > > > I think this is already prevented in check_func_arg when it sees > > meta->ref_obj_id already set, which is the more correct way to do it > > instead of looking at argument types alone. > > > > > a reference from taking refcounted args. The reasoning behind 2) isn't clear to > > > me but my best guess based on [1] is that there's some delineation between > > > "helpers which cast a refcounted thing but don't acquire" and helpers that > > > acquire. > > > > > > Maybe we can add similar type tags to OBJ_RELEASE, which you added in > > > [2], to tag args which are casted in this manner and avoid hardcoding > > > ARG_PTR_TO_SOCK_COMMON. Or at least rename arg_type_may_be_refcounted now that > > > other things may be refcounted but don't need similar casting treatment. > > > > > > > IMO there isn't any problem per se in an acquire function taking > > refcounted argument, so maybe it was just a precautionary check back > > then. I think the intention was that ARG_PTR_TO_SOCK_COMMON is never > > an argument type of an acquire function, but I don't know why. The > > commit [1] says: > > > > - check_refcount_ok() ensures is_acquire_function() cannot take > > arg_type_may_be_refcounted() as its argument. > > > > Back then only socket lookup functions were acquire functions. > > Maybe Martin can shed some light as to why it was the case and whether > > this check is needed. > Trying to recall from the log history. > > I believe {is,may_be}_acquire_function() in check_refcount_ok() was for > pre-cautionary. At that time, in check_helper_call(), > regs[BPF_REG_0].ref_obj_id was handled for is_acquire_function() first > before is_ptr_cast_function(). If somehow a func proto could do > both 'acquire' and 'ptr_cast', the ref_obj_id tracking will break. > May be a better check was to ensure a func cannot be both acquire and > ptr_cast. This ordering in check_helper_call() is reversed now though, > so I think may_be_acquire_function() can be removed from check_refcount_ok(). Makes sense to me. Let's do that instead. > Regarding to the original 'return count <= 1' check in check_refcount_ok(), > I believe the idea was to ensure the release func_proto only takes one > arg that is refcnt-ed. [bpf_sk_release was the only release func proto]. > Otherwise, if a release func can take two args that could have ref_obj_id, > the meta->ref_obj_id tracking will not work. Think about only one of them > has ref_obj_id and it is not the arg that the func proto is actually releasing. > Since there is OBJ_RELEASE now and sk is not the only refcnt type, > check_refcount_ok() may as well count by OBJ_RELEASE instead of > arg-type. +1. It also would be a good cleanup.