Hi,

On 2/17/2023 12:35 AM, Alexei Starovoitov wrote:
> On Thu, Feb 16, 2023 at 5:55 AM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>> Beside BPF_REUSE_AFTER_RCU_GP, is BPF_FREE_AFTER_RCU_GP a feasible solution ?
> The idea is for bpf_mem_free to wait normal RCU GP before adding
> the elements back to the free list and free the elem to global kernel memory
> only after both rcu and rcu_tasks_trace GPs as it's doing now.
>
>> Its downside is that it will enforce sleep-able program to use
>> bpf_rcu_read_{lock,unlock}() to access these returned pointers ?
> sleepable can access elems without kptrs/spin_locks
> even when not using rcu_read_lock, since it's safe, but there is uaf.
> Some progs might be fine with it.
> When sleepable needs to avoid uaf they will use bpf_rcu_read_lock.

Thanks for the explanation of BPF_REUSE_AFTER_RCU_GP.

It seems that BPF_REUSE_AFTER_RCU_GP may incur OOM easily. Before one RCU GP expires, the freed elements are unavailable to both the bpf ma and the slab subsystem; after the GP expires, they become available only to the one bpf ma that freed them. The number of freed elements may be too large for that one bpf ma, so the excess will be freed through call_rcu_tasks_trace(), and these freed-again elements will not be available to the slab subsystem until the tasks-trace RCU GP expires. In brief, after one RCU GP only part of the freed elements will be reused; the majority will still be freed through call_rcu_tasks_trace().

Because of the doubt above, I proposed BPF_FREE_AFTER_RCU_GP to free these elements directly after one RCU GP and to enforce that sleepable programs use bpf_rcu_read_lock() to access them, but the enforcement would break existing sleepable programs, so BPF_FREE_AFTER_RCU_GP is still not a good idea.
I will check whether or not there is still an OOM risk with BPF_REUSE_AFTER_RCU_GP and try to mitigate it if possible (e.g., share these freed elements among all bpf mas instead of only the one bpf ma which freed them).