Hi,

On 6/10/2023 2:23 AM, Alexei Starovoitov wrote:
> On Thu, Jun 8, 2023 at 7:08 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>>
>> (1) non-preallocated + no bpf memory allocator (v6.0.19)
>> use kmalloc() + call_rcu
>>
>> | name               | loop (k/s)| average memory (MiB)| peak memory (MiB)|
>> | --                 | --        | --                  | --               |
>> | no_op              | 681.40    | 0.87                | 1.00             |
>> | overwrite          | 8.56      | 38.86               | 88.42            |
>> | batch_add_batch_del| 6.74      | 41.28               | 69.70            |
>> | add_del_on_diff_cpu| 4.68      | 3.43                | 5.70             |
>>
>> (2) preallocated
>> OPTS=--preallocated
>>
>> | name               | loop (k/s)| average memory (MiB)| peak memory (MiB)|
>> | --                 | --        | --                  | --               |
>> | no_op              | 673.95    | 1.98                | 1.98             |
>> | overwrite          | 114.63    | 1.99                | 1.99             |
>> | batch_add_batch_del| 78.34     | 2.04                | 2.06             |
>> | add_del_on_diff_cpu| 6.41      | 2.23                | 2.54             |
>>
>> (3) normal bpf memory allocator
>>
>> | name               | loop (k/s)| average memory (MiB)| peak memory (MiB)|
>> | --                 | --        | --                  | --               |
>> | no_op              | 656.20    | 0.99                | 0.99             |
>> | overwrite          | 81.21     | 1.10                | 2.49             |
>> | batch_add_batch_del| 18.40     | 2.13                | 2.62             |
>> | add_del_on_diff_cpu| 5.38      | 10.40               | 18.05            |
> I have a feeling that you didn't remeasure things and just copy pasted
> above from v4.

I did rerun the benchmark for v5, but I ran it on the wrong kernel
version, so the numbers for the normal bpf memory allocator don't look
right.

> I see vastly different numbers in v5.
> and peak memory usage is broken.

My bad. I forgot to include the change to htab_mem_report_final() in the
final v5. The call to cleanup_cgroup_environment() should be moved to
the end of htab_mem_report_final(). Will fix in v6.

> It always shows:
> peak memory usage 0.00MiB