Re: [RFC bpf-next v3 3/6] bpf: Introduce BPF_MA_REUSE_AFTER_RCU_GP

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 5/3/23 4:06 PM, Alexei Starovoitov wrote:
On Wed, May 03, 2023 at 02:57:03PM -0700, Martin KaFai Lau wrote:
On 5/3/23 11:48 AM, Alexei Starovoitov wrote:
What it means that sleepable progs using hashmap will be able to avoid uaf with bpf_rcu_read_lock().
Without explicit bpf_rcu_read_lock() it's still safe and equivalent to existing behavior of bpf_mem_alloc.
(while your proposed BPF_MA_FREE_AFTER_RCU_GP flavor is not safe to use in hashtab with sleepable progs)

After that we can unconditionally remove rcu_head/call_rcu from bpf_cpumask and improve usability of bpf_obj_drop.
Probably usage of bpf_mem_alloc in local storage can be simplified as well.
Martin wdyt?

If the bpf prog always does a bpf_rcu_read_lock() before accessing the
(e.g.) task local storage, it can remove the reuse_now conditions in the
bpf_local_storage and directly call the bpf_mem_cache_free().

The only corner use case is when the bpf_prog or syscall does
bpf_task_storage_delete() instead of having the task storage stays with the
whole lifetime of the task_struct. Using REUSE_AFTER_RCU_GP will be a change
of this uaf guarantee to the sleepable program but it is still safe because
it is freed after tasks_trace gp. We could take this chance to align this
behavior of the local storage map to the other bpf maps.

For BPF_MA_FREE_AFTER_RCU_GP, there are cases that the bpf local storage
knows it can be freed without waiting tasks_trace gp. However, only
task/cgroup storages are in bpf ma and I don't believe this optimization
matter much for them. I would rather focus on the REUSE_AFTER_RCU_GP first.

I'm confused which REUSE_AFTER_RCU_GP you meant.
What I proposed above is REUSE_AFTER_rcu_GP_and_free_after_rcu_tasks_trace

Regarding REUSE_AFTER_RCU_GP, I meant REUSE_AFTER_rcu_GP_and_free_after_rcu_tasks_trace.


Hou's proposals: 1. BPF_MA_REUSE_AFTER_two_RCUs_GP 2. BPF_MA_FREE_AFTER_single_RCU_GP

It probably is where the confusion is. I thought Hou's BPF_MA_REUSE_AFTER_RCU_GP is already REUSE_AFTER_rcu_GP_and_free_after_rcu_tasks_trace. From the commit message:

" ... So introduce BPF_MA_REUSE_AFTER_RCU_GP to solve these problems. For
BPF_MA_REUSE_AFTER_GP, the freed objects are reused only after one RCU
grace period and may be returned back to slab system after another
RCU-tasks-trace grace period. ..."

[I assumed BPF_MA_REUSE_AFTER_GP is just a typo of BPF_MA_REUSE_AFTER_"RCU"_GP]


If I'm reading bpf_local_storage correctly it can remove reuse_now logic
in all conditions with REUSE_AFTER_rcu_GP_and_free_after_rcu_tasks_trace.

Right, for smap->bpf_ma == true (cgroup and task storage), all reuse_now logic can be gone and directly use the bpf_mem_cache_free(). Potentially the sk/inode can also move to bpf_ma after running some benchmark. This will simplify things a lot. For sk storage, the reuse_now was there to avoid the unnecessary tasks_trace gp because performance impact was reported on sk storage where connections can be open-and-close very frequently.



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux