Re: [PATCH v2 bpf-next 12/13] bpf: Introduce bpf_mem_free_rcu() similar to kfree_rcu().


 



Hi,

On 6/28/2023 8:52 AM, Alexei Starovoitov wrote:
> On 6/23/23 11:49 PM, Hou Tao wrote:
>> Hi,
>>
>> On 6/24/2023 11:13 AM, Alexei Starovoitov wrote:
>>> From: Alexei Starovoitov <ast@xxxxxxxxxx>
>>>
>> SNIP
>>> +static void __free_by_rcu(struct rcu_head *head)
>>> +{
>>> +    struct bpf_mem_cache *c = container_of(head, struct bpf_mem_cache, rcu);
>>> +    struct bpf_mem_cache *tgt = c->tgt;
>>> +    struct llist_node *llnode;
>>> +
>>> +    if (unlikely(READ_ONCE(c->draining)))
>>> +        goto out;
>>> +
>>> +    llnode = llist_del_all(&c->waiting_for_gp);
>>> +    if (!llnode)
>>> +        goto out;
>>> +
>>> +    if (llist_add_batch(llnode, c->waiting_for_gp_tail, &tgt->free_by_rcu_ttrace))
>>> +        tgt->free_by_rcu_ttrace_tail = c->waiting_for_gp_tail;
>> Got a null-ptr dereference oops when running multiple test_maps and
>> htab-mem benchmark after hacking htab to use bpf_mem_cache_free_rcu().
>> And I think it happened as follows:
>>
>> // c->tgt
>> P1: __free_by_rcu()
>>          // c->tgt is the same as P1
>>          P2: __free_by_rcu()
>>
>> // return true
>> P1: llist_add_batch(&tgt->free_by_rcu_ttrace)
>>          // return false
>>          P2: llist_add_batch(&tgt->free_by_rcu_ttrace)
>>          P2: do_call_rcu_ttrace
>>          // return false
>>          P2: xchg(tgt->call_rcu_ttrace_in_progress, 1)
>>          // llnode is not NULL
>>          P2: llnode = llist_del_all(&c->free_by_rcu_ttrace)
>>          // BAD: c->free_by_rcu_ttrace_tail is NULL, so oops
>>          P2: __llist_add_batch(llnode, c->free_by_rcu_ttrace_tail)
>>
>> P1: tgt->free_by_rcu_ttrace_tail = X
>>
>> I don't have a good fix for the problem except adding a spin-lock for
>> free_by_rcu_ttrace and free_by_rcu_ttrace_tail.
>
> null-ptr is probably something else, since the race window is
> extremely tiny.

The null-ptr dereference is indeed caused by free_by_rcu_ttrace_tail
being NULL. The oops occurred multiple times, and I have checked the
vmcore to confirm that.
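
To make the window concrete, here is a single-threaded userspace replay
of the interleaving from my earlier mail. It is only a sketch: struct
node, struct cache, add_batch(), del_all() and replay_race() are
hypothetical stand-ins rather than the kernel definitions, and the two
"CPUs" are simply interleaved by hand in program order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct node {
	struct node *next;
};

struct cache {
	struct node *free_by_rcu_ttrace;      /* list head */
	struct node *free_by_rcu_ttrace_tail; /* cached tail, stored by the first adder */
};

/* Single-threaded model of llist_add_batch(): true if the list was empty. */
static bool add_batch(struct node *first, struct node *last, struct node **head)
{
	bool was_empty = (*head == NULL);

	last->next = *head;
	*head = first;
	return was_empty;
}

/* Single-threaded model of llist_del_all(): detach and return the whole list. */
static struct node *del_all(struct node **head)
{
	struct node *first = *head;

	*head = NULL;
	return first;
}

/* Replays the P1/P2 interleaving and returns the tail pointer that P2 observes. */
static struct node *replay_race(void)
{
	static struct node a, b;
	struct cache tgt = { NULL, NULL };
	struct node *llnode, *tail_seen_by_p2;
	bool p1_first, p2_first;

	p1_first = add_batch(&a, &a, &tgt.free_by_rcu_ttrace); /* P1: returns true */
	p2_first = add_batch(&b, &b, &tgt.free_by_rcu_ttrace); /* P2: returns false */

	/* P2 continues into the do_call_rcu_ttrace() step and drains the list. */
	llnode = del_all(&tgt.free_by_rcu_ttrace);     /* not NULL */
	tail_seen_by_p2 = tgt.free_by_rcu_ttrace_tail; /* still NULL */

	/* The interleaving is exactly the one described in the trace. */
	assert(p1_first && !p2_first && llnode != NULL);

	/* P1 only now stores the tail, which is too late for P2. */
	tgt.free_by_rcu_ttrace_tail = &a;

	return tail_seen_by_p2;
}
```

In this model replay_race() returns NULL, i.e. P2 holds a non-empty list
together with a NULL tail, which is the state that was then passed to
__llist_add_batch().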

> In my testing this optimization doesn't buy much.
> So I'll just drop _tail optimization and switch to for_each(del_all)
> to move elements. We can revisit later.

OK
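
For reference, moving the elements one by one after a del_all could look
roughly like the userspace sketch below. It uses C11 atomics in place of
the kernel llist API, and struct node, struct lhead, del_all(),
add_one(), move_all() and demo_move() are all hypothetical names, so
this is not what the next revision of the patch will necessarily
contain.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace model of a lock-less singly linked list. */
struct node {
	struct node *next;
};

struct lhead {
	_Atomic(struct node *) first;
};

/* Detach the whole list in one atomic exchange. */
static struct node *del_all(struct lhead *h)
{
	return atomic_exchange(&h->first, (struct node *)NULL);
}

/* Push a single node; returns non-zero if the list was empty before. */
static int add_one(struct node *n, struct lhead *h)
{
	struct node *first = atomic_load(&h->first);

	assert(n != NULL);
	do {
		n->next = first;
	} while (!atomic_compare_exchange_weak(&h->first, &first, n));
	return first == NULL;
}

/* Move every element from src to dst without tracking any tail pointer. */
static int move_all(struct lhead *src, struct lhead *dst)
{
	struct node *n = del_all(src), *next;
	int moved = 0;

	for (; n; n = next) {
		next = n->next; /* read before add_one() overwrites n->next */
		add_one(n, dst);
		moved++;
	}
	return moved;
}

/* Small self-check: returns the number of moved nodes, or -1 on mismatch. */
static int demo_move(void)
{
	static struct node nodes[3];
	struct lhead src, dst;
	struct node *n;
	int i, moved, in_dst = 0;

	atomic_init(&src.first, (struct node *)NULL);
	atomic_init(&dst.first, (struct node *)NULL);
	for (i = 0; i < 3; i++)
		add_one(&nodes[i], &src);

	moved = move_all(&src, &dst);
	for (n = atomic_load(&dst.first); n; n = n->next)
		in_dst++;

	if (atomic_load(&src.first) != NULL || in_dst != moved)
		return -1;
	return moved;
}
```

The cost is one cmpxchg per element instead of one per batch, but no
second field has to be kept in sync with the list head, so the window
above disappears.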




