On Wed, 10 Aug 2022 at 02:50, Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > On Tue, Aug 9, 2022 at 3:29 PM Martin KaFai Lau <kafai@xxxxxx> wrote: > > > > On Tue, Aug 09, 2022 at 11:30:32PM +0200, Kumar Kartikeya Dwivedi wrote: > > > The LRU map that is preallocated may have its elements reused while > > > another program holds a pointer to it from bpf_map_lookup_elem. Hence, > > > only check_and_free_fields is appropriate when the element is being > > > deleted, as it ensures proper synchronization against concurrent access > > > of the map value. After that, we cannot call check_and_init_map_value > > > again as it may rewrite bpf_spin_lock, bpf_timer, and kptr fields while > > > they can be concurrently accessed from a BPF program. > > > > > > This is safe to do as when the map entry is deleted, concurrent access > > > is protected against by check_and_free_fields, i.e. an existing timer > > > would be freed, and any existing kptr will be released by it. The > > > program can create further timers and kptrs after check_and_free_fields, > > > but they will eventually be released once the preallocated items are > > > freed on map destruction, even if the item is never reused again. Hence, > > > the deleted item sitting in the free list can still have resources > > > attached to it, and they would never leak. > > > > > > With spin_lock, we never touch the field at all on delete or update, as > > > we may end up modifying the state of the lock. Since the verifier > > > ensures that a bpf_spin_lock call is always paired with bpf_spin_unlock > > > call, the program will eventually release the lock so that on reuse the > > > new user of the value can take the lock. > > The bpf_spin_lock's verifier description makes sense. Note that > > the lru map does not support spin lock for now. > > ahh. then it's not a bpf tree material. > It's a minor cleanup for bpf-next? > I was just describing what we do for each of the three types in the commit log. It still affects timers and kptrs, which lru map supports. > > > > > > Essentially, for the preallocated case, we must assume that the map > > > value may always be in use by the program, even when it is sitting in > > > the freelist, and handle things accordingly, i.e. use proper > > > synchronization inside check_and_free_fields, and never reinitialize the > > > special fields when it is reused on update. > > Acked-by: Martin KaFai Lau <kafai@xxxxxx>