Hi Paul,

On 4/27/2023 9:46 PM, Paul E. McKenney wrote:
> On Wed, Apr 26, 2023 at 09:20:49PM -0700, Alexei Starovoitov wrote:
>> On Sun, Apr 23, 2023 at 09:55:24AM +0800, Hou Tao wrote:
>>>>> ./bench htab-mem --use-case $name --max-entries 16384 \
>>>>>         --full 50 -d 7 -w 3 --producers=8 --prod-affinity=0-7
>>>>>
>>>>> | name                | loop (k/s) | average memory (MiB) | peak memory (MiB) |
>>>>> | --                  | --         | --                   | --                |
>>>>> | no_op               | 1129       | 1.15                 | 1.15              |
>>>>> | overwrite           | 24.37      | 2.07                 | 2.97              |
>>>>> | batch_add_batch_del | 10.58      | 2.91                 | 3.36              |
>>>>> | add_del_on_diff_cpu | 13.14      | 380.66               | 633.99            |
>>>> large mem for diff_cpu case needs to be investigated.
>>> The main reason is that the tasks-trace RCU grace period is slow and there
>>> is only one inflight free callback. The CPUs which only do element addition
>>> keep allocating new memory from the slab subsystem, while the CPUs which
>>> only do element deletion keep freeing these elements through
>>> call_rcu_tasks_trace(). Because the tasks-trace RCU grace period is slow,
>>> the freed elements cannot be returned to the slab subsystem in a timely
>>> manner.
>> I see. Now it makes sense. It's the slow call_rcu_tasks_trace() and not at
>> all "memory can never be reused".
>> Please explain things clearly in the commit log.
> Is this a benchmarking issue, or is this happening in real workloads?
It is just a benchmark issue. The add_del_on_diff_cpu case in the benchmark
simulates a hypothetical workload which does hash map additions and deletions
on different CPUs.
>
> If the former, one trick I use in rcutorture's callback-flooding code is
> to pass the ready-to-be-freed memory directly back to the allocating CPU.
> Which might be what you were getting at with your "maybe stealing from
> free_list of other CPUs".
Thanks, it is a good idea. I will try it later (rough sketches of both the
current scheme and your suggestion are appended at the end of this mail).
>
> If this is happening in real workloads, then I would like to better
> understand that workload.
>
> 							Thanx, Paul
> .
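
To make the problem concrete, below is a much-simplified sketch of the
deferred-free scheme I described above. It is not the actual bpf_mem_alloc
code (struct freelist and the function names are made up); it only shows why
a single inflight callback lets freed elements pile up for the duration of a
whole tasks-trace RCU grace period:

#include <linux/atomic.h>
#include <linux/llist.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>

/* Made-up names; a rough sketch, not the real bpf_mem_alloc code. */
struct freelist {
	struct llist_head free_by_rcu;		/* freed, GP not started yet */
	struct llist_node *waiting_for_gp;	/* GP in progress for these */
	atomic_t cb_in_flight;			/* at most one callback */
	struct rcu_head rcu;
};

static void free_cb(struct rcu_head *rcu)
{
	struct freelist *fl = container_of(rcu, struct freelist, rcu);
	struct llist_node *pos, *t;

	llist_for_each_safe(pos, t, fl->waiting_for_gp)
		kfree(pos);			/* finally back to slab */
	fl->waiting_for_gp = NULL;
	atomic_set(&fl->cb_in_flight, 0);
}

static void deferred_free(struct freelist *fl, struct llist_node *elem)
{
	llist_add(elem, &fl->free_by_rcu);
	if (atomic_cmpxchg(&fl->cb_in_flight, 0, 1))
		/* A GP is already pending, so under a flood of frees the
		 * ->free_by_rcu list keeps growing until it completes. */
		return;
	fl->waiting_for_gp = llist_del_all(&fl->free_by_rcu);
	call_rcu_tasks_trace(&fl->rcu, free_cb);
}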
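
And here is a sketch of what your suggestion might look like on top of that,
again with made-up names: each element records its allocating CPU, and after
the grace period the callback pushes the element back onto that CPU's reuse
list instead of calling kfree(), so the producer CPUs can reuse the memory
without going back to slab:

#include <linux/llist.h>
#include <linux/percpu.h>
#include <linux/slab.h>
#include <linux/smp.h>

/* Made-up names; only meant to illustrate the idea. */
struct elem {
	struct llist_node node;
	int alloc_cpu;			/* CPU which allocated the element */
};

static DEFINE_PER_CPU(struct llist_head, reuse_list);

/* Called from the RCU callback instead of kfree(): hand each element
 * back to the free list of its allocating CPU. llist_add() is safe for
 * concurrent cross-CPU producers. */
static void reuse_after_gp(struct llist_node *first)
{
	struct llist_node *pos, *t;

	llist_for_each_safe(pos, t, first) {
		struct elem *e = llist_entry(pos, struct elem, node);

		llist_add(&e->node, per_cpu_ptr(&reuse_list, e->alloc_cpu));
	}
}

/* Only the owning CPU pops its own reuse_list (callers run with
 * preemption disabled), so the single-consumer requirement of
 * llist_del_first() holds. */
static struct elem *elem_alloc(gfp_t gfp)
{
	struct llist_node *n = llist_del_first(this_cpu_ptr(&reuse_list));
	struct elem *e;

	e = n ? llist_entry(n, struct elem, node) : kmalloc(sizeof(*e), gfp);
	if (e)
		e->alloc_cpu = raw_smp_processor_id();
	return e;
}

The elements would still have to wait for a tasks-trace RCU grace period
before being reused, but they would no longer need a round trip through the
slab subsystem and a fresh allocation on the producer CPUs.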