On Tue, Aug 22, 2023 at 11:26 PM Yonghong Song <yonghong.song@xxxxxxxxx> wrote:
>
> On 8/21/23 12:33 PM, Dave Marchevsky wrote:
> > This is the final fix for the use-after-free scenario described in
> > commit 7793fc3babe9 ("bpf: Make bpf_refcount_acquire fallible for
> > non-owning refs"). That commit, by virtue of changing
> > bpf_refcount_acquire's refcount_inc to a refcount_inc_not_zero, fixed
> > the "refcount incr on 0" splat. The not_zero check in
> > refcount_inc_not_zero, though, still occurs on memory that could have
> > been free'd and reused, so the commit didn't properly fix the root
> > cause.
> >
> > This patch actually fixes the issue by free'ing using the recently-added
> > bpf_mem_free_rcu, which ensures that the memory is not reused until an
> > RCU grace period has elapsed. Once that has happened, there are no
> > non-owning references alive that point to the recently-free'd memory,
> > so it can be safely reused.
> >
> > Signed-off-by: Dave Marchevsky <davemarchevsky@xxxxxx>
> > Acked-by: Yonghong Song <yonghong.song@xxxxxxxxx>
> > ---
> >  kernel/bpf/helpers.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > index eb91cae0612a..945a85e25ac5 100644
> > --- a/kernel/bpf/helpers.c
> > +++ b/kernel/bpf/helpers.c
> > @@ -1913,7 +1913,11 @@ void __bpf_obj_drop_impl(void *p, const struct btf_record *rec)
> >
> >  	if (rec)
> >  		bpf_obj_free_fields(rec, p);
>
> While reviewing my percpu kptr patch,
> https://lore.kernel.org/bpf/20230814172809.1361446-1-yonghong.song@xxxxxxxxx/T/#m2f7631b8047e9f5da60a0a9cd8717fceaf1adbb7
> Kumar mentioned that although the percpu memory itself is freed under
> RCU, its record fields are freed immediately. This will cause problems,
> since there may be active uses of those fields within an RCU critical
> section, and after bpf_obj_free_fields() some fields may be
> re-initialized with new memory but no longer have any chance to be
> freed.
>
> Do we have a problem here as well?

I think it's not an issue here or in your percpu patch, since
bpf_obj_free_fields() calls __bpf_obj_drop_impl(), which will call
bpf_mem_free_rcu() (after this patch set lands). In other words, all bpf
pointers are either properly lifetime-checked through the verifier and
freed via an immediate bpf_mem_free(), or they're bpf_mem_free_rcu()'d.

> I am thinking whether I could create another flavor of bpf_mem_free_rcu
> with a pre_free_callback function, something like
>
> bpf_mem_free_rcu_cb2(struct bpf_mem_alloc *ma, void *ptr,
>                      void (*cb)(void *, void *), void *arg1, void *arg2)
>
> The cb(arg1, arg2) will be called right before the real free of "ptr".
>
> For example, for this patch, the callback function could be
>
> static void bpf_obj_free_fields_cb(void *rec, void *p)
> {
> 	if (rec)
> 		bpf_obj_free_fields(rec, p);
> 	/* We need to ensure that recursive freeing of fields
> 	 * is done immediately, which means we will add a
> 	 * parameter to __bpf_obj_drop_impl() to indicate
> 	 * whether bpf_mem_free or bpf_mem_free_rcu should
> 	 * be called.
> 	 */
> }
>
> bpf_mem_free_rcu_cb2(&bpf_global_ma, p, bpf_obj_free_fields_cb, rec, p);

It sounds nice in theory, but it will be difficult to implement. bpf_ma
would need an extra two 8-byte pointers for every object, and there is
no room for them at present. So, to free after an RCU GP with a
callback, it would need to allocate 16 bytes (at least), which might
fail, and so on. bpf_mem_free_rcu_cb2() would be the last resort.