Hi Paul,

On 4/27/2023 9:46 PM, Paul E. McKenney wrote:
> On Wed, Apr 26, 2023 at 09:20:49PM -0700, Alexei Starovoitov wrote:
>> On Sun, Apr 23, 2023 at 09:55:24AM +0800, Hou Tao wrote:
>>>>> ./bench htab-mem --use-case $name --max-entries 16384 \
>>>>>         --full 50 -d 7 -w 3 --producers=8 --prod-affinity=0-7
>>>>>
>>>>> | name                | loop (k/s) | average memory (MiB) | peak memory (MiB) |
>>>>> | --                  | --         | --                   | --                |
>>>>> | no_op               | 1129       | 1.15                 | 1.15              |
>>>>> | overwrite           | 24.37      | 2.07                 | 2.97              |
>>>>> | batch_add_batch_del | 10.58      | 2.91                 | 3.36              |
>>>>> | add_del_on_diff_cpu | 13.14      | 380.66               | 633.99            |
>>>> large mem for diff_cpu case needs to be investigated.
>>> The main reason is that the tasks-trace RCU grace period is slow and there
>>> is only one inflight free callback. The CPUs which only do element addition
>>> keep allocating new memory from the slab subsystem, while the CPUs which
>>> only do element deletion keep freeing these elements through
>>> call_rcu_tasks_trace(). Because the tasks-trace RCU grace period is slow,
>>> the freed elements cannot be returned to the slab subsystem in a timely
>>> manner.
>> I see. Now it makes sense. It's the slow call_rcu_tasks_trace() and not at
>> all "memory can never be reused".
>> Please explain things clearly in the commit log.
> Is this a benchmarking issue, or is this happening in real workloads?
It is just a benchmark issue. The add_del_on_diff_cpu case in the benchmark
simulates a hypothetical workload which does hash map additions and deletions
on different CPUs.
>
> If the former, one trick I use in rcutorture's callback-flooding code is
> to pass the ready-to-be-freed memory directly back to the allocating CPU.
> Which might be what you were getting at with your "maybe stealing from
> free_list of other CPUs".
Thanks, it is a good idea. I will try it later (rough sketches of both the
current scheme and your suggestion are appended at the end of this mail).
>
> If this is happening in real workloads, then I would like to better
> understand that workload.
>
> 							Thanx, Paul
> .
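
To make the problem concrete, below is a much-simplified sketch of the
deferred-free scheme I described above. It is not the actual bpf_mem_alloc
code (struct freelist and the function names are made up); it only shows why
a single inflight callback lets freed elements pile up for the duration of a
whole tasks-trace RCU grace period:

#include <linux/atomic.h>
#include <linux/llist.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>

/* Made-up names; a rough sketch, not the real bpf_mem_alloc code. */
struct freelist {
	struct llist_head free_by_rcu;		/* freed, GP not started yet */
	struct llist_node *waiting_for_gp;	/* GP in progress for these */
	atomic_t cb_in_flight;			/* at most one callback */
	struct rcu_head rcu;
};

static void free_cb(struct rcu_head *rcu)
{
	struct freelist *fl = container_of(rcu, struct freelist, rcu);
	struct llist_node *pos, *t;

	llist_for_each_safe(pos, t, fl->waiting_for_gp)
		kfree(pos);			/* finally back to slab */
	fl->waiting_for_gp = NULL;
	atomic_set(&fl->cb_in_flight, 0);
}

static void deferred_free(struct freelist *fl, struct llist_node *elem)
{
	llist_add(elem, &fl->free_by_rcu);
	if (atomic_cmpxchg(&fl->cb_in_flight, 0, 1))
		/* A GP is already pending, so under a flood of frees the
		 * ->free_by_rcu list keeps growing until it completes. */
		return;
	fl->waiting_for_gp = llist_del_all(&fl->free_by_rcu);
	call_rcu_tasks_trace(&fl->rcu, free_cb);
}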
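
And here is a sketch of what your suggestion might look like on top of that,
again with made-up names: each element records its allocating CPU, and after
the grace period the callback pushes the element back onto that CPU's reuse
list instead of calling kfree(), so the producer CPUs can reuse the memory
without going back to slab:

#include <linux/llist.h>
#include <linux/percpu.h>
#include <linux/slab.h>
#include <linux/smp.h>

/* Made-up names; only meant to illustrate the idea. */
struct elem {
	struct llist_node node;
	int alloc_cpu;			/* CPU which allocated the element */
};

static DEFINE_PER_CPU(struct llist_head, reuse_list);

/* Called from the RCU callback instead of kfree(): hand each element
 * back to the free list of its allocating CPU. llist_add() is safe for
 * concurrent cross-CPU producers. */
static void reuse_after_gp(struct llist_node *first)
{
	struct llist_node *pos, *t;

	llist_for_each_safe(pos, t, first) {
		struct elem *e = llist_entry(pos, struct elem, node);

		llist_add(&e->node, per_cpu_ptr(&reuse_list, e->alloc_cpu));
	}
}

/* Only the owning CPU pops its own reuse_list (callers run with
 * preemption disabled), so the single-consumer requirement of
 * llist_del_first() holds. */
static struct elem *elem_alloc(gfp_t gfp)
{
	struct llist_node *n = llist_del_first(this_cpu_ptr(&reuse_list));
	struct elem *e;

	e = n ? llist_entry(n, struct elem, node) : kmalloc(sizeof(*e), gfp);
	if (e)
		e->alloc_cpu = raw_smp_processor_id();
	return e;
}

The elements would still have to wait for a tasks-trace RCU grace period
before being reused, but they would no longer need a round trip through the
slab subsystem and a fresh allocation on the producer CPUs.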