On Wed, Nov 8, 2023 at 11:26 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> On 11/9/2023 2:36 PM, Martin KaFai Lau wrote:
> > On 11/7/23 6:06 AM, Hou Tao wrote:
> >> From: Hou Tao <houtao1@xxxxxxxxxx>
> >>
> >> bpf_map_of_map_fd_get_ptr() will convert the map fd to the pointer
> >> saved in map-in-map. bpf_map_of_map_fd_put_ptr() will release the
> >> pointer saved in map-in-map. These two helpers will be used by the
> >> following patches to fix the use-after-free problems for map-in-map.
> >>
> >> Signed-off-by: Hou Tao <houtao1@xxxxxxxxxx>
> >> ---
> >>  kernel/bpf/map_in_map.c | 51 +++++++++++++++++++++++++++++++++++++++++
> >>  kernel/bpf/map_in_map.h | 11 +++++++--
> >>  2 files changed, 60 insertions(+), 2 deletions(-)
> >>
> >>
> SNIP
> >> +void bpf_map_of_map_fd_put_ptr(void *ptr, bool need_defer)
> >> +{
> >> +	struct bpf_inner_map_element *element = ptr;
> >> +
> >> +	/* Do bpf_map_put() after a RCU grace period and a tasks trace
> >> +	 * RCU grace period, so it is certain that the bpf program which is
> >> +	 * manipulating the map now has exited when bpf_map_put() is called.
> >> +	 */
> >> +	if (need_defer)
> >
> > "need_defer" should only happen from the syscall cmd? Instead of
> > adding rcu_head to each element, how about
> > "synchronize_rcu_mult(call_rcu, call_rcu_tasks)" here?
>
> No. I have tried the method before, but it didn't work due to dead-lock
> (will mention that in commit message in v2). The reason is that bpf
> syscall program may also do map update through sys_bpf helper. Because
> bpf syscall program is running with sleep-able context and has
> rcu_read_lock_trace being held, so call synchronize_rcu_mult(call_rcu,
> call_rcu_tasks) will lead to dead-lock.

Dead-lock? why?

I think it's legal to do call_rcu_tasks_trace() while inside RCU CS
or RCU tasks trace CS.
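
For illustration only, here is a userspace toy model of the distinction
under discussion. It is not kernel code: every toy_* name is made up, a
single counter stands in for both the RCU and the RCU tasks trace
read-side critical sections, and everything runs in one thread. The
assumption it encodes is that a blocking wait for a grace period cannot
finish while the caller itself is inside a read-side CS, whereas queueing
a callback returns at once and the callback runs after the reader exits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rcu_head. */
struct toy_rcu_head {
	struct toy_rcu_head *next;
	void (*func)(struct toy_rcu_head *head);
};

/* Hypothetical element with an embedded head, as the patch proposes. */
struct toy_element {
	struct toy_rcu_head rcu;	/* first member, so the cast below is valid */
	bool freed;
};

struct toy_result {
	bool sync_would_complete;	/* blocking wait, called inside the CS */
	bool freed_inside_cs;		/* callback ran before the CS ended? */
	bool freed_after_cs;		/* callback ran once the CS ended? */
};

static int toy_readers;			/* nesting depth of read-side CS */
static struct toy_rcu_head *toy_pending;	/* queued callbacks */

static void toy_run_callbacks(void)
{
	while (toy_pending) {
		struct toy_rcu_head *head = toy_pending;

		toy_pending = head->next;
		head->func(head);
	}
}

static void toy_read_lock(void)
{
	toy_readers++;
}

static void toy_read_unlock(void)
{
	/* The toy "grace period" ends when the last reader leaves. */
	if (--toy_readers == 0)
		toy_run_callbacks();
}

/* Models a blocking wait: reports whether it could ever return.
 * A real blocking wait would hang here instead of returning false.
 */
static bool toy_synchronize(void)
{
	return toy_readers == 0;
}

/* Models an asynchronous callback: queue it and return immediately. */
static void toy_call_rcu(struct toy_rcu_head *head,
			 void (*func)(struct toy_rcu_head *head))
{
	head->func = func;
	head->next = toy_pending;
	toy_pending = head;
	if (toy_readers == 0)
		toy_run_callbacks();
}

static void toy_free_element(struct toy_rcu_head *head)
{
	struct toy_element *element = (struct toy_element *)head;

	element->freed = true;	/* stands in for bpf_map_put() + free */
}

struct toy_result toy_demo(void)
{
	struct toy_element element = { .freed = false };
	struct toy_result res;

	toy_read_lock();	/* like a sleepable prog holding the trace lock */
	res.sync_would_complete = toy_synchronize();
	toy_call_rcu(&element.rcu, toy_free_element);
	res.freed_inside_cs = element.freed;
	toy_read_unlock();	/* reader exits, queued callback runs */
	res.freed_after_cs = element.freed;
	return res;
}
```

Calling toy_demo() shows the blocking variant reporting that it could
not complete inside the CS, while the queued callback only runs after
toy_read_unlock(). The price of the asynchronous variant in this model
is the embedded toy_rcu_head in each element.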