On Tue, Feb 14, 2023 at 6:36 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> On 2/12/2023 12:33 AM, Alexei Starovoitov wrote:
> > On Fri, Feb 10, 2023 at 5:10 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
> >>>> Hou, are you planning to resubmit this change? I also hit this while
> >>>> testing my changes on bpf-next.
> >>> Are you talking about the whole patch set or just GFP_ZERO in mem_alloc?
> >>> The former will take a long time to settle.
> >>> The latter is trivial.
> >>> To unblock yourself just add GFP_ZERO in an extra patch?
> >> Sorry for the long delay. I just found time to do some tests to compare
> >> the performance of bzero and ctor. After that is done, I will resubmit
> >> next week.
> > I still don't like ctor as a concept. In general, callbacks in the
> > critical path are guaranteed to be slow due to retpoline overhead.
> > Please send a patch to add GFP_ZERO.
> I see. Will do. But I think it is better to know the coarse overhead of
> these two methods, so I hacked map_perf_test to support a customizable
> value size for hash_map_alloc and ran some benchmarks to show the
> overheads of ctor and GFP_ZERO. These benchmarks were run on a KVM VM
> with 8 CPUs. When the number of allocated elements is small, the
> overheads of ctor and bzero are basically the same, but when the number
> of allocated elements increases (e.g., the map is half full), the
> overhead of ctor becomes bigger. For a big value size, the overheads of
> ctor and bzero are basically the same, and it seems the main overhead
> comes from slab allocation. The detailed results follow:

and with retpoline?
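
For reference, a minimal sketch of the two approaches being compared,
assuming allocations funnel through __alloc() in kernel/bpf/memalloc.c
(the exact gfp flag set may differ by tree, and the c->ctor hook below is
hypothetical, shown only for comparison, not an existing kernel field):

  /* GFP_ZERO approach: let the slab allocator zero the object;
   * no indirect call on the allocation path.
   */
  static void *__alloc(struct bpf_mem_cache *c, int node)
  {
          gfp_t flags = GFP_NOWAIT | __GFP_NOWARN | __GFP_ACCOUNT |
                        __GFP_ZERO;

          return kmalloc_node(c->unit_size, flags, node);
  }

  /* ctor approach (hypothetical c->ctor hook): the indirect call is
   * what pays the retpoline penalty on the allocation path when
   * mitigations are enabled.
   */
  static void *__alloc_ctor(struct bpf_mem_cache *c, int node)
  {
          void *obj = kmalloc_node(c->unit_size,
                                   GFP_NOWAIT | __GFP_NOWARN | __GFP_ACCOUNT,
                                   node);

          if (obj && c->ctor)
                  c->ctor(obj, c->unit_size); /* e.g. memset(obj, 0, size) */
          return obj;
  }

Either way the zeroing itself is per-element; the concern about ctor
above is the extra indirect call, which retpoline turns into a noticeably
more expensive branch sequence.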