Hi, On 1/2/2023 2:48 AM, Yonghong Song wrote: > > > On 12/31/22 5:26 PM, Alexei Starovoitov wrote: >> On Fri, Dec 30, 2022 at 12:11:45PM +0800, Hou Tao wrote: >>> From: Hou Tao <houtao1@xxxxxxxxxx> >>> >>> Hi, >>> >>> The patchset tries to fix the problems found when checking how htab map >>> handles element reuse in bpf memory allocator. The immediate reuse of >>> freed elements may lead to two problems in htab map: >>> >>> (1) reuse will reinitialize special fields (e.g., bpf_spin_lock) in >>> htab map value and it may corrupt lookup procedure with BFP_F_LOCK >>> flag which acquires bpf-spin-lock during value copying. The >>> corruption of bpf-spin-lock may result in hard lock-up. >>> (2) lookup procedure may get incorrect map value if the found element is >>> freed and then reused. >>> >>> Because the type of htab map elements are the same, so problem #1 can be >>> fixed by supporting ctor in bpf memory allocator. The ctor initializes >>> these special fields in map element only when the map element is newly >>> allocated. If it is just a reused element, there will be no >>> reinitialization. >> >> Instead of adding the overhead of ctor callback let's just >> add __GFP_ZERO to flags in __alloc(). >> That will address the issue 1 and will make bpf_mem_alloc behave just >> like percpu_freelist, so hashmap with BPF_F_NO_PREALLOC and default >> will behave the same way. > > Patch https://lore.kernel.org/all/20220809213033.24147-3-memxor@xxxxxxxxx/ > tried to address a similar issue for lru hash table. > Maybe we need to do similar things after bpf_mem_cache_alloc() for > hash table? IMO ctor or __GFP_ZERO will fix the issue. Did I miss something here ? > > >> >>> Problem #2 exists for both non-preallocated and preallocated htab map. >>> By adding seq in htab element, doing reuse check and retrying the >>> lookup procedure may be a feasible solution, but it will make the >>> lookup API being hard to use, because the user needs to check whether >>> the found element is reused or not and repeat the lookup procedure if it >>> is reused. A simpler solution would be just disabling freed elements >>> reuse and freeing these elements after lookup procedure ends. >> >> You've proposed this 'solution' twice already in qptrie thread and both >> times the answer was 'no, we cannot do this' with reasons explained. >> The 3rd time the answer is still the same. >> This 'issue 2' existed in hashmap since very beginning for many years. >> It's a known quirk. There is nothing to fix really. >> >> The graph apis (aka new gen data structs) with link list and rbtree are >> in active development. Soon bpf progs will be able to implement their own >> hash maps with explicit bpf_rcu_read_lock. At that time the progs will >> be making the trade off between performance and lookup/delete race. >> So please respin with just __GFP_ZERO and update the patch 6 >> to check for lockup only.