Re: [PATCH v2 bpf-next 09/13] bpf: Allow reuse from waiting_for_gp_ttrace list.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On 6/26/2023 12:42 PM, Alexei Starovoitov wrote:
> On Sun, Jun 25, 2023 at 8:30 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>> Hi,
>>
>> On 6/24/2023 11:13 AM, Alexei Starovoitov wrote:
>>> From: Alexei Starovoitov <ast@xxxxxxxxxx>
>>>
>>> alloc_bulk() can reuse elements from free_by_rcu_ttrace.
>>> Let it reuse from waiting_for_gp_ttrace as well to avoid unnecessary kmalloc().
>>>
>>> Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx>
>>> ---
>>>  kernel/bpf/memalloc.c | 9 +++++++++
>>>  1 file changed, 9 insertions(+)
>>>
>>> diff --git a/kernel/bpf/memalloc.c b/kernel/bpf/memalloc.c
>>> index 692a9a30c1dc..666917c16e87 100644
>>> --- a/kernel/bpf/memalloc.c
>>> +++ b/kernel/bpf/memalloc.c
>>> @@ -203,6 +203,15 @@ static void alloc_bulk(struct bpf_mem_cache *c, int cnt, int node)
>>>       if (i >= cnt)
>>>               return;
>>>
>>> +     for (; i < cnt; i++) {
>>> +             obj = llist_del_first(&c->waiting_for_gp_ttrace);
>> After allowing to reuse elements from waiting_for_gp_ttrace, there may
>> be concurrent llist_del_first() and llist_del_all() as shown below and
>> llist_del_first() is not safe because the elements freed from free_rcu()
>> could be reused immediately and head->first may be added back to
>> c0->waiting_for_gp_ttrace by other process.
>>
>> // c0
>> alloc_bulk()
>>     llist_del_first(&c->waiting_for_gp_ttrace)
>>
>> // c1->tgt = c0
>> free_rcu()
>>     llist_del_all(&c->waiting_for_gp_ttrace)
> I'm still thinking about how to fix the other issues you've reported,
> but this one, I believe, is fine.
> Are you basing 'not safe' on a comment?
> Why xchg(&head->first, NULL); on one cpu and
> try_cmpxchg(&head->first, &entry, next);
> is unsafe?
No, I didn't reason it only based on the comment in llist.h. The reason
I though it was not safe because llist_del_first() may have ABA problem
as pointed by you early, because after __free_rcu(), the elements on
waiting_for_gp_ttrace would be available immediately and
waiting_for_gp_ttrace->first may be reused and then added back to
waiting_for_gp_ttrace->first again as shown below.

// c0->waiting_for_gp_ttrace 
A -> B -> C -> nil 
 
P1: 
alloc_bulk(c0) 
    llist_del_first(&c0->waiting_for_gp_ttrace) 
        entry = A 
        next = B 
 
    P2: __free_rcu 
        // A/B/C is freed back to slab 
        // c0->waiting_for_gp_ttrace->first = NULL 
        free_all(llist_del_all(c0)) 
 
    P3: alloc_bulk(c1) 
        // got A 
        __kmalloc() 
 
    // free A (from c1), ..., last free X (allocated from c0) 
    P3: unit_free(c1)
        // the last freed element X is from c0
        c1->tgt = c0 
        c1->free_llist->first -> X -> Y -> ... -> A 
    P3: free_bulk(c1) 
        enque_to_free(c0) 
            c0->free_by_rcu_ttrace->first -> A -> ... -> Y -> X 
        __llist_add_batch(c0->waiting_for_gp_ttrace) 
            c0->waiting_for_gp_ttrace = A -> ... -> Y -> X 

P1: 
    // A is added back as first again
    // but llist_del_first() didn't know
    try_cmpxhg(&c0->waiting_for_gp_ttrace->first, A, B) 
    // c0->waiting_for_gp_trrace is corrupted 
    c0->waiting_for_gp_ttrace->first = B





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux