Hi,

On 6/24/2023 11:13 AM, Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@xxxxxxxxxx>
>
> Introduce bpf_mem_[cache_]free_rcu() similar to kfree_rcu().
> Unlike bpf_mem_[cache_]free(), which links objects into the per-cpu free
> list for immediate reuse, the _rcu() flavor waits for an RCU grace period
> and then moves objects onto the free_by_rcu_ttrace list, where they wait
> for an RCU task trace grace period before being freed into slab.
>
SNIP
> +static void check_free_by_rcu(struct bpf_mem_cache *c)
> +{
> +	struct llist_node *llnode, *t;
> +
> +	if (llist_empty(&c->free_by_rcu) && llist_empty(&c->free_llist_extra_rcu))
> +		return;
> +
> +	/* drain free_llist_extra_rcu */
> +	llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist_extra_rcu))
> +		if (__llist_add(llnode, &c->free_by_rcu))
> +			c->free_by_rcu_tail = llnode;

Just like add_obj_to_free_list(), we should do a conditional
local_irq_save(flags) plus local_inc_return(&c->active) here to protect
free_by_rcu as well; otherwise free_by_rcu may be corrupted by a bpf
program running in NMI context.

> +
> +	if (atomic_xchg(&c->call_rcu_in_progress, 1)) {
> +		/*
> +		 * Instead of kmalloc-ing new rcu_head and triggering 10k
> +		 * call_rcu() to hit rcutree.qhimark and force RCU to notice
> +		 * the overload just ask RCU to hurry up. There could be many
> +		 * objects in free_by_rcu list.
> +		 * This hint reduces memory consumption for an artificial
> +		 * benchmark from 2 Gbyte to 150 Mbyte.
> +		 */
> +		rcu_request_urgent_qs_task(current);
> +		return;
> +	}
> +
> +	WARN_ON_ONCE(!llist_empty(&c->waiting_for_gp));
> +
> +	WRITE_ONCE(c->waiting_for_gp.first, __llist_del_all(&c->free_by_rcu));
> +	c->waiting_for_gp_tail = c->free_by_rcu_tail;
> +	call_rcu_hurry(&c->rcu, __free_by_rcu);

The same protection is needed for c->free_by_rcu_tail as well.
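
Not tested, but a minimal sketch of what I have in mind, borrowing the
conditional local_irq_save()/local_inc_return(&c->active) pattern from
add_obj_to_free_list(); the 'out' label and the exact extent of the
protected section are my guess at the fix, not something from the patch:

static void check_free_by_rcu(struct bpf_mem_cache *c)
{
	struct llist_node *llnode, *t;
	unsigned long flags;

	if (llist_empty(&c->free_by_rcu) && llist_empty(&c->free_llist_extra_rcu))
		return;

	if (IS_ENABLED(CONFIG_PREEMPT_RT))
		/* In RT irq_work runs in a per-cpu kthread, so disable
		 * interrupts to reduce the chance of a bpf prog running
		 * on this cpu while 'active' is busy.
		 */
		local_irq_save(flags);
	/* Protect free_by_rcu and free_by_rcu_tail from an NMI bpf prog
	 * doing unit_free_rcu() on this cpu, the same way 'active'
	 * protects free_llist in add_obj_to_free_list().
	 */
	WARN_ON_ONCE(local_inc_return(&c->active) != 1);

	/* drain free_llist_extra_rcu */
	llist_for_each_safe(llnode, t, llist_del_all(&c->free_llist_extra_rcu))
		if (__llist_add(llnode, &c->free_by_rcu))
			c->free_by_rcu_tail = llnode;

	if (atomic_xchg(&c->call_rcu_in_progress, 1)) {
		rcu_request_urgent_qs_task(current);
		goto out;
	}

	WARN_ON_ONCE(!llist_empty(&c->waiting_for_gp));

	/* reading free_by_rcu_tail is covered by the same section */
	WRITE_ONCE(c->waiting_for_gp.first, __llist_del_all(&c->free_by_rcu));
	c->waiting_for_gp_tail = c->free_by_rcu_tail;
	call_rcu_hurry(&c->rcu, __free_by_rcu);
out:
	local_dec(&c->active);
	if (IS_ENABLED(CONFIG_PREEMPT_RT))
		local_irq_restore(flags);
}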