On Mon, Jun 29, 2020 at 06:08:48PM -0700, Andrii Nakryiko wrote:
> On Mon, Jun 29, 2020 at 5:58 PM Andrii Nakryiko
> <andrii.nakryiko@xxxxxxxxx> wrote:
> >
> > On Mon, Jun 29, 2020 at 5:35 PM Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > From: Alexei Starovoitov <ast@xxxxxxxxxx>
> > >
> > > bpf_free_used_maps() or close(map_fd) will trigger map_free callback.
> > > bpf_free_used_maps() is called after bpf prog is no longer executing:
> > > bpf_prog_put->call_rcu->bpf_prog_free->bpf_free_used_maps.
> > > Hence there is no need to call synchronize_rcu() to protect map elements.
> > >
> > > Signed-off-by: Alexei Starovoitov <ast@xxxxxxxxxx>
> > > ---
> >
> > Seems correct. And nice that maps don't have to care about this anymore.
> >
>
> Actually, what about the map-in-map case?
>
> What if you had an array-of-maps with an inner map element. It is the
> last reference to that map. Now you have two BPF prog executions in
> parallel. One looked up that inner map and is updating it at the
> moment. Another execution at the same time deletes that map. That
> deletion will call bpf_map_put(), which without synchronize_rcu() will
> free memory. All the while the former BPF program execution is still
> working with that map.

The delete of that inner map can only be done via sys_bpf(), and there
we call maybe_wait_bpf_programs() exactly to avoid this kind of problem.
It's also necessary for user space: when the user does a map_update/delete
of an inner map, then as soon as the syscall returns the user can process
the old map with the guarantee that no bpf prog is still touching it.
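
To make the ordering concrete, here is a rough user-space toy model of
the syscall-side wait. It is only a sketch and not the kernel code:
every identifier ending in _model is hypothetical, synchronize_rcu() is
replaced by a counter, and the assumption that the wait applies only to
the map-in-map types (array-of-maps and hash-of-maps) reflects my
reading of maybe_wait_bpf_programs() and should be checked against the
tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel map types (hypothetical names). Only the
 * split between plain maps and map-in-map types matters here. */
enum map_type_model {
	MAP_TYPE_HASH_MODEL,
	MAP_TYPE_ARRAY_OF_MAPS_MODEL,
	MAP_TYPE_HASH_OF_MAPS_MODEL,
};

struct map_model {
	enum map_type_model type;
};

/* Counts how often the modelled syscall path waited for a grace period. */
int grace_periods_waited;

/* Stand-in for synchronize_rcu(). The real call blocks until every
 * pre-existing RCU read-side section (i.e. running BPF prog) is done;
 * this model only records that the wait was requested. */
void synchronize_rcu_model(void)
{
	grace_periods_waited++;
}

/* Models maybe_wait_bpf_programs(): wait only when the map being
 * modified is an outer map of a map-in-map (assumption, see above). */
void maybe_wait_bpf_programs_model(const struct map_model *map)
{
	assert(map != NULL);
	if (map->type == MAP_TYPE_ARRAY_OF_MAPS_MODEL ||
	    map->type == MAP_TYPE_HASH_OF_MAPS_MODEL)
		synchronize_rcu_model();
}

/* Models the update/delete syscall path: change the outer map, then
 * wait before returning to user space. Returns true if it waited. */
bool sys_bpf_delete_model(const struct map_model *outer)
{
	int before = grace_periods_waited;

	/* ... the inner map would be unlinked from the outer map here ... */
	maybe_wait_bpf_programs_model(outer);
	return grace_periods_waited != before;
}
```

In this model a delete on an array-of-maps returns only after the
(simulated) grace period, which is the point at which user space may
treat the old inner map as quiescent. A delete on a plain hash map
returns without waiting.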