Hi,

On 6/28/2023 8:52 AM, Alexei Starovoitov wrote:
> On 6/23/23 11:49 PM, Hou Tao wrote:
>> Hi,
>>
>> On 6/24/2023 11:13 AM, Alexei Starovoitov wrote:
>>> From: Alexei Starovoitov <ast@xxxxxxxxxx>
>>>
>> SNIP
>>> +static void __free_by_rcu(struct rcu_head *head)
>>> +{
>>> +    struct bpf_mem_cache *c = container_of(head, struct bpf_mem_cache, rcu);
>>> +    struct bpf_mem_cache *tgt = c->tgt;
>>> +    struct llist_node *llnode;
>>> +
>>> +    if (unlikely(READ_ONCE(c->draining)))
>>> +        goto out;
>>> +
>>> +    llnode = llist_del_all(&c->waiting_for_gp);
>>> +    if (!llnode)
>>> +        goto out;
>>> +
>>> +    if (llist_add_batch(llnode, c->waiting_for_gp_tail, &tgt->free_by_rcu_ttrace))
>>> +        tgt->free_by_rcu_ttrace_tail = c->waiting_for_gp_tail;
>> Got a null-ptr dereference oops when running multiple test_maps and
>> htab-mem benchmark after hacking htab to use bpf_mem_cache_free_rcu().
>> And I think it happened as follows:
>>
>> // c->tgt
>> P1: __free_by_rcu()
>> // c->tgt is the same as P1
>> P2: __free_by_rcu()
>>
>> // return true
>> P1: llist_add_batch(&tgt->free_by_rcu_ttrace)
>> // return false
>> P2: llist_add_batch(&tgt->free_by_rcu_ttrace)
>> P2: do_call_rcu_ttrace
>> // return false
>> P2: xchg(tgt->call_rcu_ttrace_in_progress, 1)
>> // llnode is not NULL
>> P2: llnode = llist_del_all(&c->free_by_rcu_ttrace)
>> // BAD: c->free_by_rcu_ttrace_tail is NULL, so oops
>> P2: __llist_add_batch(llnode, c->free_by_rcu_ttrace_tail)
>>
>> P1: tgt->free_by_rcu_ttrace_tail = X
>>
>> I don't have a good fix for the problem except adding a spin-lock for
>> free_by_rcu_ttrace and free_by_rcu_ttrace_tail.
>
> null-ptr is probably something else, since the race window is
> extremely tiny.

The null-ptr dereference is indeed due to free_by_rcu_ttrace_tail being
NULL. The oops occurred multiple times, and I have checked the vmcore to
confirm that.

> In my testing this optimization doesn't buy much.
> So I'll just drop _tail optimization and switch to for_each(del_all)
> to move elements.
> We can revisit later.

OK
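
For reference, here is a rough userspace sketch of what "for_each(del_all) to move elements" could look like. It is not the actual patch. struct node, struct lhead, list_add(), list_del_all() and list_move_all() are hypothetical stand-ins for the kernel llist API, built on C11 atomics. The idea is that each node is pushed individually, so there is no separate tail pointer that could be read before another CPU has set it.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for struct llist_node / llist_head. */
struct node {
    struct node *next;
    int val;
};

struct lhead {
    _Atomic(struct node *) first;
};

/* Push one node; returns 1 if the list was empty before the push. */
static int list_add(struct node *n, struct lhead *h)
{
    struct node *old = atomic_load(&h->first);

    do {
        n->next = old; /* old is refreshed by a failed compare-exchange */
    } while (!atomic_compare_exchange_weak(&h->first, &old, n));
    return old == NULL;
}

/* Atomically detach the whole list and return its former first node. */
static struct node *list_del_all(struct lhead *h)
{
    return atomic_exchange(&h->first, NULL);
}

/*
 * Move every node from src to dst one at a time. Only the head pointer
 * is shared state, so no tail pointer has to be kept in sync with it.
 * Returns the number of nodes moved.
 */
static int list_move_all(struct lhead *src, struct lhead *dst)
{
    struct node *n = list_del_all(src);
    int moved = 0;

    while (n) {
        struct node *next = n->next; /* save before list_add() rewrites it */

        list_add(n, dst);
        n = next;
        moved++;
    }
    return moved;
}
```

The cost is one compare-exchange per element instead of one per batch, which matches the remark above that the _tail optimization did not buy much in testing.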