On 6/23/23 11:49 PM, Hou Tao wrote:
Hi,
On 6/24/2023 11:13 AM, Alexei Starovoitov wrote:
From: Alexei Starovoitov <ast@xxxxxxxxxx>
SNIP
+static void __free_by_rcu(struct rcu_head *head)
+{
+ struct bpf_mem_cache *c = container_of(head, struct bpf_mem_cache, rcu);
+ struct bpf_mem_cache *tgt = c->tgt;
+ struct llist_node *llnode;
+
+ if (unlikely(READ_ONCE(c->draining)))
+ goto out;
+
+ llnode = llist_del_all(&c->waiting_for_gp);
+ if (!llnode)
+ goto out;
+
+ if (llist_add_batch(llnode, c->waiting_for_gp_tail, &tgt->free_by_rcu_ttrace))
+ tgt->free_by_rcu_ttrace_tail = c->waiting_for_gp_tail;
Got a null-ptr dereference oops when running multiple test_maps instances and
the htab-mem benchmark after hacking htab to use bpf_mem_cache_free_rcu().
And I think it happened as follows:
P1: __free_by_rcu()                                  // reads c->tgt
P2: __free_by_rcu()                                  // c->tgt is the same as P1's
P1: llist_add_batch(&tgt->free_by_rcu_ttrace)        // returns true
P2: llist_add_batch(&tgt->free_by_rcu_ttrace)        // returns false
P2: do_call_rcu_ttrace()
P2: xchg(tgt->call_rcu_ttrace_in_progress, 1)        // returns 0 (false)
P2: llnode = llist_del_all(&c->free_by_rcu_ttrace)   // llnode is not NULL
P2: __llist_add_batch(llnode, c->free_by_rcu_ttrace_tail)
    // BAD: c->free_by_rcu_ttrace_tail is still NULL, so oops
P1: tgt->free_by_rcu_ttrace_tail = X
I don't have a good fix for the problem except adding a spin-lock for
free_by_rcu_ttrace and free_by_rcu_ttrace_tail.
The null-ptr dereference is probably something else, since the race window
here is extremely tiny.
In my testing this optimization doesn't buy much anyway.
So I'll just drop the _tail optimization and switch to for_each(del_all)
to move the elements. We can revisit later.