On 3/18/2023 12:40 AM, Chris Lai wrote:
> Might be a bug using bpf_timer on Hashmap?
> With same setups using bpf_timer but with LRU_Hashmap, the memory
> usage is way better: see following
>
> with LRU_Hashmap
> 16M capacity, 1 minute bpf_timer callback/cleanup.. (pre-allocation
> ~5G), memory usage peaked ~7G (Flat and does not fluctuate - unlike
> Hashmap)
> 32M capacity, 1 minute bpf_timer callback/cleanup.. (pre-allocation
> ~8G), memory usage peaked ~12G (Flat and does not fluctuate - unlike
> Hashmap)

In your setup, the LRU hash map is preallocated and the normal hash map
is not preallocated (i.e. it uses BPF_F_NO_PREALLOC), right? If so,
could you please test the memory usage of a preallocated hash map? Also,
could you please share the Linux kernel version you are using and how
you create and operate on the hash map?

> On Thu, Mar 16, 2023 at 6:22 PM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
>> On Thu, Mar 16, 2023 at 12:18 PM Chris Lai <chrlai@xxxxxxxxxxxxx> wrote:
>>> Hello,
>>> Using BPF Hashmap with bpf_timer for each entry value and callback to
>>> delete the entry after 1 minute.
>>> Constantly creating load to insert elements onto the map, we have
>>> observed the following:
>>> -3M map capacity, 1 minute bpf_timer callback/cleanup, memory usage
>>> peaked around 5GB
>>> -16M map capacity, 1 minute bpf_timer callback/cleanup, memory usage
>>> peaked around 34GB
>>> -24M map capacity, 1 minute bpf_timer callback/cleanup, memory usage
>>> peaked around 55GB
>>> Wondering if this is expected and what is causing the huge increase in
>>> memory as we increase the number of elements inserted onto the map.
>>> Thank you.

Do the addition and deletion of hash map entries happen on different
CPUs? If so, and the bpf memory allocator is used (kernel version >=
6.1), the memory blow-up may be explainable: a new allocation cannot
reuse the memory freed by entry deletion, so memory usage increases
rapidly. I have tested such a case and also written a selftest for it,
but it seems the problem can only be mitigated [1], because the RCU
tasks trace GP is slow. If your setup has to stick with a
non-preallocated hash map, you could first try adding
"rcupdate.rcu_task_enqueue_lim=nr_cpus" to the kernel boot command line
to mitigate the problem.

[1] https://lore.kernel.org/bpf/20221209010947.3130477-1-houtao@xxxxxxxxxxxxxxx/

>> That's not normal. Do you have a small reproducer?
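
To illustrate the cross-CPU case described above, here is a toy
user-space model in C. It is only a sketch under stated assumptions, not
the kernel's bpf memory allocator: it counts objects instead of
allocating memory, and every name in it (toy_alloc, toy_get, toy_put,
peak_footprint, HIGH_WATERMARK) is hypothetical. It assumes each CPU
keeps a small private cache of freed objects, that only the freeing CPU
can reuse that cache, and that any surplus returns to the system only
after a grace period.

```c
#include <assert.h>

#define NCPU 2            /* toy model: two CPUs are enough */
#define HIGH_WATERMARK 8  /* hypothetical per-CPU cache limit */

/* Toy accounting model; it counts objects and allocates no real memory. */
struct toy_alloc {
    long cache[NCPU];    /* freed objects the same CPU may reuse at once */
    long pending[NCPU];  /* surplus objects waiting for a grace period */
    long from_system;    /* objects currently held from the system */
    long peak;           /* largest from_system value seen so far */
};

/* Allocate on `cpu`: reuse a locally cached object, else grow the footprint. */
static void toy_get(struct toy_alloc *a, int cpu)
{
    if (a->cache[cpu] > 0) {
        a->cache[cpu]--;
        return;
    }
    a->from_system++;
    if (a->from_system > a->peak)
        a->peak = a->from_system;
}

/* Free on `cpu`: cache locally; surplus above the watermark must wait. */
static void toy_put(struct toy_alloc *a, int cpu)
{
    a->cache[cpu]++;
    if (a->cache[cpu] > HIGH_WATERMARK) {
        a->pending[cpu] += a->cache[cpu] - HIGH_WATERMARK;
        a->cache[cpu] = HIGH_WATERMARK;
    }
}

/* A grace period ends: pending objects finally go back to the system. */
static void toy_grace_period(struct toy_alloc *a)
{
    for (int cpu = 0; cpu < NCPU; cpu++) {
        a->from_system -= a->pending[cpu];
        a->pending[cpu] = 0;
    }
}

/*
 * Run `rounds` insert+delete pairs, inserting on alloc_cpu and deleting
 * on free_cpu, with a grace period completing every gp_interval rounds.
 * Returns the peak number of objects held from the system.
 */
long peak_footprint(int alloc_cpu, int free_cpu, long rounds, long gp_interval)
{
    struct toy_alloc a = {0};

    assert(alloc_cpu >= 0 && alloc_cpu < NCPU);
    assert(free_cpu >= 0 && free_cpu < NCPU);
    assert(gp_interval > 0);

    for (long r = 1; r <= rounds; r++) {
        toy_get(&a, alloc_cpu);
        toy_put(&a, free_cpu);
        if (r % gp_interval == 0)
            toy_grace_period(&a);
    }
    return a.peak;
}
```

In this model, peak_footprint(0, 0, 10000, 1000) returns 1 because the
same CPU keeps reusing one object, while peak_footprint(0, 1, 10000,
1000) returns 1008: the allocating CPU never sees the freed objects, so
the footprint grows with the length of the grace period. Shortening the
interval to 100 rounds lowers the peak to 108, which is the toy
equivalent of making the grace period complete faster.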