Re: [RFC PATCH bpf-next 0/6] bpf: Handle reuse in bpf memory alloc

Hi,

On 2/12/2023 12:33 AM, Alexei Starovoitov wrote:
> On Fri, Feb 10, 2023 at 5:10 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>>>> Hou, are you planning to resubmit this change? I also hit this while testing my
>>>> changes on bpf-next.
>>> Are you talking about the whole patch set or just GFP_ZERO in mem_alloc?
>>> The former will take a long time to settle.
>>> The latter is trivial.
>>> To unblock yourself just add GFP_ZERO in an extra patch?
>> Sorry for the long delay. I just found time to run some tests comparing the
>> performance of bzero and ctor. Once that is done, I will resubmit next week.
> I still don't like ctor as a concept. In general the callbacks in the critical
> path are guaranteed to be slow due to retpoline overhead.
> Please send a patch to add GFP_ZERO.
I see. Will do. But I think it is better to first know the rough overhead of
these two methods, so I hacked map_perf_test to support a customizable value
size for hash_map_alloc and ran some benchmarks to measure the overhead of
ctor versus GFP_ZERO. The benchmarks were run on an 8-CPU KVM VM. When the
number of allocated elements is small, the overheads of ctor and bzero are
basically the same, but as the number of allocated elements grows (e.g., a
half-full table), the overhead of ctor becomes larger. For big value sizes,
the overheads of ctor and bzero are again basically the same, apparently
because the main cost comes from slab allocation. The detailed results follow:

* ./map_perf_test 4 8 8192 10000 $value_size

The key of the htab is the thread pid, so only 8 elements are allocated.

| value_size | 8      | 256    | 4K     | 16K    | 64K    | 256K   |
| --         | --     | --     | --     | --     | --     | --     |
| base       | 256604 | 261112 | 173646 | 74195  | 23138  | 6275   |
| bzero      | 253362 | 257563 | 171445 | 73303  | 22949  | 6249   |
| ctor       | 264570 | 258246 | 175048 | 72511  | 23004  | 6270   |

* ./map_perf_test 4 8 8192 100 $value_size

The key is still the thread pid, so only 8 elements are allocated. The loop
count is decreased to 100 to show the overhead of the first allocation.

| value_size | 8      | 256    | 4K     | 16K    | 64K    | 256K   |
| --         | --     | --     | --     | --     | --     | --     |
| base       | 135662 | 137742 | 87043  | 36265  | 12501  | 4450   |
| bzero      | 139993 | 134920 | 94570  | 37190  | 12543  | 4131   |
| ctor       | 147949 | 141825 | 94321  | 38240  | 13131  | 4248   |

* ./map_perf_test 4 8 8192 1000 $value_size

Each thread creates 512 different keys, so the hash table is half full. The
loop count is also decreased to 1000.

| value_size | 8      | 256    | 4K     | 16K    | 64K    | 256K   |
| --         | --     | --     | --     | --     | --     | --     |
| base       | 4234   | 4289   | 1478   | 510    | 168    | 46     |
| bzero      | 3792   | 4002   | 1473   | 515    | 161    | 37     |
| ctor       | 3846   | 2198   | 1269   | 499    | 161    | 42     |

* ./map_perf_test 4 8 8192 100 $value_size

Each thread creates 512 different keys, so the hash table is half full. The
loop count is also decreased to 100.

| value_size | 8      | 256    | 4K     | 16K    | 64K    | 256K   |
| --         | --     | --     | --     | --     | --     | --     |
| base       | 3669   | 3419   | 1272   | 476    | 168    | 44     |
| bzero      | 3468   | 3499   | 1274   | 476    | 150    | 36     |
| ctor       | 2235   | 2312   | 1128   | 452    | 145    | 35     |
>
> Also I realized that we can make the BPF_REUSE_AFTER_RCU_GP flag usable
> without risking OOM by only waiting for normal rcu GP and not rcu_tasks_trace.
> This approach will work for inner nodes of qptrie, since bpf progs
> never see pointers to them. It will work for local storage
> converted to bpf_mem_alloc too. It wouldn't need to use its own call_rcu.
> It's also safe without uaf caveat in sleepable progs and sleepable progs
> can use explicit bpf_rcu_read_lock() when they want to avoid uaf.
> So please respin the set with rcu gp only and that new flag.
> .
