On Tue, 30 Aug 2022 at 01:18, Delyan Kratunov <delyank@xxxxxx> wrote: > > > [...] > > > > I don't think that assumption will hold. Requiring RCU protection for > > all local kptrs means allocator cache reuse becomes harder, as > > elements are stuck in freelist until the next grace period. It > > necessitates use of larger caches. > > For some use cases where they can tolerate reuse, it might not be > > optimal. IMO the allocator should be independent of how the lifetime > > of elements is managed. > > All maps and allocations are already rcu-protected, we're not adding new here. We're > only relying on this rcu protection (c.f. the chain of call_rcu_task_trace and > call_rcu in the patchset) to prove that no program can observe an allocator pointer > that is corrupted or stale. > > > > > That said, even if you assume RCU protection, that still doesn't > > address the real problem. Yes, you can access the value without > > worrying about it moving to another map, but consider this case: > > Prog unloading, > > populate_used_allocators -> map walks map_values, tries to take > > reference to local kptr whose backing allocator is A. > > Loads kptr, meanwhile some other prog sharing access to the map value > > exchanges (kptr_xchg) another pointer into that field. > > Now you take reference to allocator A in used_allocators, while actual > > value in map is for allocator B. > > This is fine. The algorithm is conservative, it may keep allocators around longer > than they're needed. Eventually there will exist a time that this map won't be > accessible to any program, at which point both allocator A and B will be released. > > It is possible to make a more precise algorithm but given that this behavior is only > really a problem with (likely pinned) long-lived maps, it's imo not worth it for v1. > > > > > So you either have to cmpxchg and then retry if it fails (which is not > > a wait-free operation, and honestly not great imo), or if you don't > > handle this: > > Someone moved an allocated local kptr backed by B into your map, but > > you don't hold reference to it. > > You don't need a reference while something else is holding the allocator alive. The > references exist to extend the lifetime past the lifetimes of programs that can > directly reference the allocator. > > > That other program may release > > allocator map -> allocator, > > The allocator map cannot be removed while it's in used_maps of the program. On > program unload, we'll add B to the used_allocators list, as a value from B is in the > map. Only then will the allocator map be releasable. > Ahh, so prog1 and prog2 both share map M, but not allocator map A and B, so if it swaps in pointer of B into M while prog1 is unloading, it will take unneeded ref to A (it it sees kptr to A just before swap). But when prog2 will unload, it will then see that ptr value is from B so it will also take ref to B in M's used_allocators. Then A stays alive for a little while longer, but only when this race happens. But there won't be any correctness problem as both A and B are kept alive. Makes sense, but IIUC this only takes care of this specific case. It can also see 'NULL' and miss taking reference to A. prog1 uses A, M prog2 uses B, M A and B are allocator maps, M is normal hashmap, M is shared between both. prog1 is unloading: M holds kptr from A. during walk, before loading field, prog2 xchg it to NULL, M does not take ref to A. // 1 Right after it is done processing this field during its walk, prog2 now xchg it back in, now M is holding A but did not take ref to A. // 2 prog1 unloads. M's used_allocators list is empty. M is still kept alive by prog2. Now prog2 has already moved its result of kptr_xchg back into the map in step 2. Hence, prog2 execution can terminate, this ends its RCU section. Allocator A is waiting for all prior RCU readers, eventually it can be freed. Now prog2 runs again, starts a new RCU section, kptr_xchg this ptr from M, and tries to free it. Allocator A is already gone. If this is also taken care of, I'll only be poking at the code next when you post it, so that I don't waste any more of your time. But IIUC this can still cause issues, right?