On Tue, Feb 14, 2023 at 6:36 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> On 2/12/2023 12:33 AM, Alexei Starovoitov wrote:
> > On Fri, Feb 10, 2023 at 5:10 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
> >>>> Hou, are you planning to resubmit this change? I also hit this while
> >>>> testing my changes on bpf-next.
> >>> Are you talking about the whole patch set or just GFP_ZERO in mem_alloc?
> >>> The former will take a long time to settle.
> >>> The latter is trivial.
> >>> To unblock yourself just add GFP_ZERO in an extra patch?
> >> Sorry for the long delay. I just found time to do some tests to compare
> >> the performance of bzero and ctor. After that is done, I will resubmit
> >> next week.
> > I still don't like ctor as a concept. In general, callbacks in the
> > critical path are guaranteed to be slow due to retpoline overhead.
> > Please send a patch to add GFP_ZERO.
> I see. Will do. But I think it is better to know the coarse overhead of
> these two methods, so I hacked map_perf_test to support a customizable
> value size for hash_map_alloc and ran some benchmarks to show the
> overheads of ctor and GFP_ZERO. These benchmarks were run on a KVM VM
> with 8 CPUs. When the number of allocated elements is small, the
> overheads of ctor and bzero are basically the same, but when the number
> of allocated elements increases (e.g., the map is half full), the
> overhead of ctor becomes bigger. For a big value size, the overheads of
> ctor and bzero are basically the same, and it seems the main overhead
> comes from slab allocation. The detailed results follow:

and with retpoline?
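
For reference, a minimal sketch of the two approaches being compared,
assuming allocations funnel through __alloc() in kernel/bpf/memalloc.c
(the exact gfp flag set may differ by tree, and the c->ctor hook below is
hypothetical, shown only for comparison, not an existing kernel field):

  /* GFP_ZERO approach: let the slab allocator zero the object;
   * no indirect call on the allocation path.
   */
  static void *__alloc(struct bpf_mem_cache *c, int node)
  {
          gfp_t flags = GFP_NOWAIT | __GFP_NOWARN | __GFP_ACCOUNT |
                        __GFP_ZERO;

          return kmalloc_node(c->unit_size, flags, node);
  }

  /* ctor approach (hypothetical c->ctor hook): the indirect call is
   * what pays the retpoline penalty on the allocation path when
   * mitigations are enabled.
   */
  static void *__alloc_ctor(struct bpf_mem_cache *c, int node)
  {
          void *obj = kmalloc_node(c->unit_size,
                                   GFP_NOWAIT | __GFP_NOWARN | __GFP_ACCOUNT,
                                   node);

          if (obj && c->ctor)
                  c->ctor(obj, c->unit_size); /* e.g. memset(obj, 0, size) */
          return obj;
  }

Either way the zeroing itself is per-element; the concern about ctor
above is the extra indirect call, which retpoline turns into a noticeably
more expensive branch sequence.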