On Wed, Oct 30, 2024 at 3:02 AM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>
> >> a) lookup
> >> max_entries = 8K
> >>
> >> before:
> >> 0:hash_lookup 72347325 lookups per sec
> >>
> >> after:
> >> 0:hash_lookup 64758890 lookups per sec
> > is surprising.
> >
> > Two conditional branches contribute to 12% performance loss?
> > Something fishy.
> > Try unlikely() to hopefully recover most of it.
> > After analyzing 'perf report/annotate', of course.
>
> Using unlikely/likely doesn't help much. It seems the big performance
> gap is due to the inline of lookup_nulls_elem_raw() in
> __htab_map_lookup_elem(). Still don't know the reason why
> lookup_nulls_elem_raw() is not inlined after the change. After marking
> the lookup_nulls_elem_raw() function as inline, the performance gap is
> within ~2% for htab map lookup. For htab_map_update/delete_elem(), the
> reason and the result is similar. Should I mark these two functions
> (lookup_nulls_elem_raw and lookup_elem_raw) as inline in the next
> revision, or should I leave it as is and try to fix the degradation in
> another patch set ?

from 12% to 2% by adding 'inline' to lookup_[nulls_]elem_raw() ?
Certainly do it in the patch set.
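
For anyone following along, here is a rough userspace sketch of the kind of change being discussed: a small bucket-walk helper marked 'static inline', next to an otherwise identical copy forced out of line. This is not the code from kernel/bpf/hashtab.c. Every name in it (toy_elem, toy_table, toy_lookup_raw and so on) is made up for illustration, and the noinline attribute assumes GCC or Clang. Plain 'inline' is only a hint to the compiler, so whether the helper really ends up inlined still has to be checked in the generated code.

```c
/* Illustrative sketch only: NOT the kernel's hashtab.c.
 * All names below (toy_*) are hypothetical.
 */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BUCKETS 8u /* power of two, so the hash can be masked */

struct toy_elem {
	struct toy_elem *next;
	uint32_t key;
	uint32_t value;
};

struct toy_table {
	struct toy_elem *buckets[TOY_BUCKETS];
};

/* Hot-path helper. 'static inline' asks the compiler to fold the
 * loop into each caller, but the compiler is free to ignore it.
 */
static inline struct toy_elem *toy_lookup_raw(struct toy_elem *head,
					      uint32_t key)
{
	struct toy_elem *e;

	for (e = head; e; e = e->next)
		if (e->key == key)
			return e;
	return NULL;
}

/* Same body, forced out of line (GCC/Clang attribute), so the two
 * variants can be compared in a benchmark or in the disassembly.
 */
static __attribute__((noinline)) struct toy_elem *
toy_lookup_raw_noinline(struct toy_elem *head, uint32_t key)
{
	struct toy_elem *e;

	for (e = head; e; e = e->next)
		if (e->key == key)
			return e;
	return NULL;
}

/* Push an element onto the front of its bucket's chain. */
static void toy_insert(struct toy_table *t, struct toy_elem *e)
{
	uint32_t idx = e->key & (TOY_BUCKETS - 1);

	assert(idx < TOY_BUCKETS);
	e->next = t->buckets[idx];
	t->buckets[idx] = e;
}

/* Caller that benefits when toy_lookup_raw() is inlined. */
static struct toy_elem *toy_lookup(struct toy_table *t, uint32_t key)
{
	return toy_lookup_raw(t->buckets[key & (TOY_BUCKETS - 1)], key);
}

/* Identical caller that always pays for a real function call. */
static struct toy_elem *toy_lookup_noinline(struct toy_table *t, uint32_t key)
{
	return toy_lookup_raw_noinline(t->buckets[key & (TOY_BUCKETS - 1)],
				       key);
}
```

Both variants return the same element, and only the call overhead differs. Looking at the caller with 'perf annotate' (or in the disassembly) is the way to confirm whether a call to the helper is still present.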