Re: [RFC bpf-next v3 3/6] bpf: Introduce BPF_MA_REUSE_AFTER_RCU_GP

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On 5/4/2023 7:39 AM, Martin KaFai Lau wrote:
> On 5/3/23 4:06 PM, Alexei Starovoitov wrote:
>> On Wed, May 03, 2023 at 02:57:03PM -0700, Martin KaFai Lau wrote:
>>> On 5/3/23 11:48 AM, Alexei Starovoitov wrote:
SNIP
>>>
>>> If the bpf prog always does a bpf_rcu_read_lock() before accessing the
>>> (e.g.) task local storage, it can remove the reuse_now conditions in
>>> the
>>> bpf_local_storage and directly call the bpf_mem_cache_free().
>>>
>>> The only corner use case is when the bpf_prog or syscall does
>>> bpf_task_storage_delete() instead of having the task storage stays
>>> with the
>>> whole lifetime of the task_struct. Using REUSE_AFTER_RCU_GP will be
>>> a change
>>> of this uaf guarantee to the sleepable program but it is still safe
>>> because
>>> it is freed after tasks_trace gp. We could take this chance to align
>>> this
>>> behavior of the local storage map to the other bpf maps.
>>>
>>> For BPF_MA_FREE_AFTER_RCU_GP, there are cases that the bpf local
>>> storage
>>> knows it can be freed without waiting tasks_trace gp. However, only
>>> task/cgroup storages are in bpf ma and I don't believe this
>>> optimization
>>> matter much for them. I would rather focus on the REUSE_AFTER_RCU_GP
>>> first.
OK.
>>
>> I'm confused which REUSE_AFTER_RCU_GP you meant.
>> What I proposed above is
>> REUSE_AFTER_rcu_GP_and_free_after_rcu_tasks_trace
>
> Regarding REUSE_AFTER_RCU_GP, I meant
> REUSE_AFTER_rcu_GP_and_free_after_rcu_tasks_trace.
>
>>
>> Hou's proposals: 1. BPF_MA_REUSE_AFTER_two_RCUs_GP 2.
>> BPF_MA_FREE_AFTER_single_RCU_GP
>
> It probably is where the confusion is. I thought Hou's
> BPF_MA_REUSE_AFTER_RCU_GP is already
> REUSE_AFTER_rcu_GP_and_free_after_rcu_tasks_trace. From the commit
> message:
>
> " ... So introduce BPF_MA_REUSE_AFTER_RCU_GP to solve these problems. For
> BPF_MA_REUSE_AFTER_GP, the freed objects are reused only after one RCU
> grace period and may be returned back to slab system after another
> RCU-tasks-trace grace period. ..."
>
> [I assumed BPF_MA_REUSE_AFTER_GP is just a typo of
> BPF_MA_REUSE_AFTER_"RCU"_GP]
Yes. Now the implementation of BPF_MA_REUSE_AFTER_RCU_GP is already
being REUSE_AFTER_rcu_GP_and_free_after_rcu_tasks_trace. It moves the
free objects to reuse_ready_head list after one RCU GP, splices the
elements in reuse_ready_head to wait_for_free when reuse_ready_head is
not empty and frees these elements in wait_for_free by
call_rcu_tasks_trace().
>
>>
>> If I'm reading bpf_local_storage correctly it can remove reuse_now logic
>> in all conditions with
>> REUSE_AFTER_rcu_GP_and_free_after_rcu_tasks_trace.
>
> Right, for smap->bpf_ma == true (cgroup and task storage), all
> reuse_now logic can be gone and directly use the bpf_mem_cache_free().
> Potentially the sk/inode can also move to bpf_ma after running some
> benchmark. This will simplify things a lot. For sk storage, the
> reuse_now was there to avoid the unnecessary tasks_trace gp because
> performance impact was reported on sk storage where connections can be
> open-and-close very frequently.





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux