On Tue, 30 Aug 2022 at 01:45, Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Mon, Aug 29, 2022 at 4:18 PM Delyan Kratunov <delyank@xxxxxx> wrote:
> >
> > >
> > > It is not very precise, but until those maps are gone it delays
> > > release of the allocator (we can empty all percpu caches to save
> > > memory once bpf_map pinning the allocator is gone, because allocations
> > > are not going to be served). But it allows unit_free to be relatively
> > > less costly as long as those 'candidate' maps are around.
> >
> > Yes, we considered this but it's much easier to get to pathological behaviors, by
> > just loading and unloading programs that can access an allocator in a loop. The
> > freelists being empty help but it's still quite easy to hold a lot of memory for
> > nothing.
> >
> > The pointer walk was proposed to prune most such pathological cases while still being
> > conservative enough to be easy to implement. Only races with the pointer walk can
> > extend the lifetime unnecessarily.
>
> I'm getting lost in this thread.
>
> Here is my understanding so far:
> We don't free kernel kptrs from map in release_uref,
> but we should for local kptrs, since such objs are
> not much different from timers.
> So release_uref will xchg all such kptrs and free them
> into the allocator without touching allocator's refcnt.
> So there is no concurrency issue that Kumar was concerned about.

Haven't really thought through whether this will fix the concurrent
kptr swap problem, but then with this I think you need:
- New helper bpf_local_kptr_xchg(map, map_value, kptr)
- Associating map_uid of map, map_value
- Always doing atomic_inc_not_zero(map->usercnt) for each call to
  local_kptr_xchg
1 and 2 because of inner_maps, 3 because of release_uref.
But maybe not a deal breaker?

> We might need two arrays though.
> prog->used_allocators[] and map->used_allocators[]
> The verifier would populate both at load time.
> At prog unload dec refcnt in one array.
> At map free dec refcnt in the other array.
> Map-in-map insert/delete of new map would copy allocators[] from
> outer map.
> As the general suggestion to solve this problem I think
> we really need to avoid run-time refcnt changes at alloc/free
> even when they're per-cpu 'fast'.
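
To make the two-array idea quoted above a bit more concrete, here is a rough
userspace model of it. This is only a sketch under assumptions, not kernel
code: every struct and function name below is hypothetical (only the
prog->used_allocators[] / map->used_allocators[] idea comes from the thread),
the fixed array size is arbitrary, and a plain int stands in for what would
presumably be a refcount_t in the kernel. The point it tries to show is that
the allocator's refcnt is touched only at load, unload, map free and
map-in-map insert, never on the alloc/free path.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_USED_ALLOCATORS 4	/* arbitrary bound, for the sketch only */

/* Hypothetical stand-in for an allocator; only its lifetime is modelled. */
struct allocator {
	int refcnt;	/* changed at load/unload/free time, not in alloc/free */
	int destroyed;	/* set once the last reference is dropped */
};

/* Hypothetical shape of the used_allocators[] arrays from the thread. */
struct used_allocators {
	struct allocator *arr[MAX_USED_ALLOCATORS];
	size_t cnt;
};

struct prog { struct used_allocators used; };
struct map  { struct used_allocators used; };

static void allocator_get(struct allocator *a)
{
	a->refcnt++;
}

static void allocator_put(struct allocator *a)
{
	if (--a->refcnt == 0)
		a->destroyed = 1;
}

/* Record an allocator once and take one reference; -1 if the array is full. */
static int used_allocators_add(struct used_allocators *u, struct allocator *a)
{
	for (size_t i = 0; i < u->cnt; i++)
		if (u->arr[i] == a)
			return 0;	/* already tracked, no extra reference */
	if (u->cnt == MAX_USED_ALLOCATORS)
		return -1;
	allocator_get(a);
	u->arr[u->cnt++] = a;
	return 0;
}

/* Drop every reference held through one array. */
static void used_allocators_release(struct used_allocators *u)
{
	for (size_t i = 0; i < u->cnt; i++)
		allocator_put(u->arr[i]);
	u->cnt = 0;
}

/* Load time: the "verifier" populates both arrays. */
static int verifier_record(struct prog *p, struct map *m, struct allocator *a)
{
	if (used_allocators_add(&p->used, a))
		return -1;
	return used_allocators_add(&m->used, a);
}

/* Map-in-map insert: the inner map copies the outer map's allocators. */
static int map_in_map_insert(const struct map *outer, struct map *inner)
{
	for (size_t i = 0; i < outer->used.cnt; i++)
		if (used_allocators_add(&inner->used, outer->used.arr[i]))
			return -1;
	return 0;
}

/* Prog unload drops refs from one array, map free from the other. */
static void prog_unload(struct prog *p) { used_allocators_release(&p->used); }
static void map_free(struct map *m)     { used_allocators_release(&m->used); }
```

In this model a prog and a map that both reference one allocator hold two
references after load; inserting an inner map adds a third, and the allocator
only goes away after prog_unload() plus map_free() on both maps. It says
nothing about the concurrent kptr swap question raised above.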