On 1/29/25 01:03, Steven Rostedt wrote:
> On Tue, 28 Jan 2025 15:43:13 -0800
> Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
>
>> > How slow is it to always do the call instead of inlining?
>>
>> Let's see... The additional overhead if we always call is:
>>
>> Little core: 2.42%
>> Middle core: 1.23%
>> Big core: 0.66%
>>
>> Not a huge deal because the overhead of memory profiling when enabled
>> is much higher. So, maybe for simplicity I should indeed always call?
>
> That's what I was thinking, unless the other maintainers are OK with this
> special logic.

If it's acceptable, I would prefer to always call. But at the same time make
sure the static key test is really inlined, i.e. force inline
alloc_tagging_slab_alloc_hook() (see my other reply looking at the
disassembly).

Well or rather just open-code the contents of the
alloc_tagging_slab_alloc_hook and alloc_tagging_slab_free_hook (as they look
after this patch) into the callers. It's just two lines. The extra layer is
just unnecessary distraction.

Then it's probably inevitable the actual hook content after the static key
test should not be inline even with
CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT as the result would be inlined
into too many places. But since we remove one call layer anyway thanks to
above, even without the full inlining the resulting performance could
hopefully be fine (compared to the state before your series).

> -- Steve
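
To make the shape of the suggestion concrete, here is a minimal userspace sketch, not the actual slab code. It assumes a plain bool (mem_alloc_profiling_key) as a stand-in for the kernel static key, and the names alloc_hook_slowpath(), free_hook_slowpath(), slab_post_alloc(), slab_free_object() and the two counters are hypothetical, made up for illustration. The enabled test is open-coded in the callers, and the hook bodies stay out of line so they are not duplicated into every call site. __builtin_expect and the noinline attribute are GCC/Clang extensions.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Assumption: a plain bool stands in for the kernel static key. A real
 * static branch is patched at runtime; this only approximates its shape.
 */
static bool mem_alloc_profiling_key;

/* Hypothetical counters so the effect of the hooks is observable. */
static unsigned long tagged_allocs;
static unsigned long tagged_frees;

/*
 * Out-of-line hook bodies (hypothetical names). Keeping them noinline
 * avoids copying the heavy part into every allocation site.
 */
__attribute__((noinline))
static void alloc_hook_slowpath(void *object, size_t size)
{
	(void)object;
	(void)size;
	tagged_allocs++;
}

__attribute__((noinline))
static void free_hook_slowpath(void *object)
{
	(void)object;
	tagged_frees++;
}

/*
 * Callers with the enabled test open-coded: no wrapper layer, just the
 * cheap check followed by a real call when profiling is on.
 */
static inline void slab_post_alloc(void *object, size_t size)
{
	if (__builtin_expect(mem_alloc_profiling_key, 0))
		alloc_hook_slowpath(object, size);
}

static inline void slab_free_object(void *object)
{
	if (__builtin_expect(mem_alloc_profiling_key, 0))
		free_hook_slowpath(object);
}
```

With the key off, the callers only pay for the test and never reach the slow paths. With it on, each caller makes exactly one call, which is the single remaining call layer discussed above.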