On Wed, Apr 26, 2023 at 09:20:49PM -0700, Alexei Starovoitov wrote:
> On Sun, Apr 23, 2023 at 09:55:24AM +0800, Hou Tao wrote:
> > >
> > >> ./bench htab-mem --use-case $name --max-entries 16384 \
> > >>         --full 50 -d 7 -w 3 --producers=8 --prod-affinity=0-7
> > >>
> > >> | name                | loop (k/s) | average memory (MiB) | peak memory (MiB) |
> > >> | --                  | --         | --                   | --                |
> > >> | no_op               | 1129       | 1.15                 | 1.15              |
> > >> | overwrite           | 24.37      | 2.07                 | 2.97              |
> > >> | batch_add_batch_del | 10.58      | 2.91                 | 3.36              |
> > >> | add_del_on_diff_cpu | 13.14      | 380.66               | 633.99            |
> > >
> > > large mem for diff_cpu case needs to be investigated.
> >
> > The main reason is that the tasks-trace RCU grace period is slow and
> > there is only one free callback in flight, so the CPUs that only do
> > element addition keep allocating new memory from slab, while the CPUs
> > that only do element deletion keep freeing those elements through
> > call_rcu_tasks_trace(). Because the tasks-trace RCU grace period is
> > slow, the freed elements cannot be returned to the slab subsystem in
> > a timely manner.
>
> I see. Now it makes sense. It's the slow call_rcu_tasks_trace() and not
> at all "memory can never be reused".
> Please explain things clearly in the commit log.

Is this a benchmarking issue, or is it happening in real workloads?

If the former, one trick I use in rcutorture's callback-flooding code is
to pass the ready-to-be-freed memory directly back to the allocating CPU,
which might be what you were getting at with your "maybe stealing from
free_list of other CPUs".

If this is happening in real workloads, then I would like to better
understand that workload.

							Thanx, Paul
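
To make the failure mode above concrete, here is a minimal sketch of the
deferred-free pattern Hou Tao describes. It assumes a hypothetical
struct my_elem and hypothetical helper names rather than the actual bpf
memory allocator code; only call_rcu_tasks_trace() from
<linux/rcupdate_trace.h> is the real kernel API.

#include <linux/slab.h>
#include <linux/rcupdate_trace.h>

struct my_elem {
	struct rcu_head rcu;
	/* payload ... */
};

static void my_elem_free_cb(struct rcu_head *rcu)
{
	/* Runs only after a tasks-trace RCU grace period has elapsed;
	 * this is the point where the memory finally returns to slab.
	 */
	kfree(container_of(rcu, struct my_elem, rcu));
}

static void my_elem_delete(struct my_elem *elem)
{
	/* The deleting CPU queues the element and moves on.  When grace
	 * periods are slow, freed-but-unreclaimed elements accumulate
	 * in the pending-callback queue while the adding CPUs keep
	 * allocating fresh memory from slab.
	 */
	call_rcu_tasks_trace(&elem->rcu, my_elem_free_cb);
}

Since nothing bounds how much memory can sit waiting for the grace
period to end, a slow grace period translates directly into the large
average and peak memory of the add_del_on_diff_cpu row above.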
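
And a rough sketch of the trick Paul mentions: hand ready-to-be-freed
memory straight back to the CPU that allocated it. Everything here
(struct reuse_elem, reuse_list, the helper names) is hypothetical
illustration, not rcutorture's or bpf's actual code; llist_add(),
llist_del_first(), and the per-CPU helpers are real kernel APIs. The
sketch assumes callers run with preemption disabled and elides any
grace-period ordering that would still be needed before reuse.

#include <linux/llist.h>
#include <linux/percpu.h>
#include <linux/slab.h>
#include <linux/smp.h>

struct reuse_elem {
	struct llist_node node;
	int alloc_cpu;			/* CPU that allocated this element */
	/* payload ... */
};

static DEFINE_PER_CPU(struct llist_head, reuse_list);

static struct reuse_elem *reuse_alloc(void)
{
	struct llist_head *head = this_cpu_ptr(&reuse_list);
	struct llist_node *node;
	struct reuse_elem *elem;

	/* Each list has a single consumer (its owning CPU), so
	 * llist_del_first() here is safe against concurrent
	 * llist_add() calls from remote CPUs.
	 */
	node = llist_del_first(head);
	if (node)
		return llist_entry(node, struct reuse_elem, node);

	elem = kmalloc(sizeof(*elem), GFP_ATOMIC);
	if (elem)
		elem->alloc_cpu = smp_processor_id();
	return elem;
}

static void reuse_free(struct reuse_elem *elem)
{
	/* Instead of kfree(), push the element back onto the free list
	 * of the CPU that allocated it, so that CPU can reuse the
	 * memory on its next allocation rather than going to slab.
	 */
	llist_add(&elem->node, per_cpu_ptr(&reuse_list, elem->alloc_cpu));
}

The point of routing the memory back to elem->alloc_cpu is that the
adding CPUs then satisfy new allocations from their reuse lists instead
of pulling ever more memory from slab, which bounds the footprint even
when reclamation is slow.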