On Fri, May 31, 2024 at 2:33 AM Vlastimil Babka <vbabka@xxxxxxx> wrote:
>
> Since commit 4f6923fbb352 ("mm: make should_failslab always available for
> fault injection") should_failslab() is unconditionally a noinline
> function. This adds visible overhead to the slab allocation hotpath,
> even if the function is empty. With CONFIG_FAILSLAB=y there's additional
> overhead when the functionality is not enabled by a boot parameter or
> debugfs.
>
> The overhead can be eliminated with a static key around the callsite.
> Fault injection and error injection frameworks can now be told that the
> function has a static key associated, and are able to enable and
> disable it accordingly.
>
> Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>
> ---
>  mm/failslab.c |  2 +-
>  mm/slab.h     |  3 +++
>  mm/slub.c     | 10 +++++++---
>  3 files changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/mm/failslab.c b/mm/failslab.c
> index ffc420c0e767..878fd08e5dac 100644
> --- a/mm/failslab.c
> +++ b/mm/failslab.c
> @@ -9,7 +9,7 @@ static struct {
>  	bool ignore_gfp_reclaim;
>  	bool cache_filter;
>  } failslab = {
> -	.attr = FAULT_ATTR_INITIALIZER,
> +	.attr = FAULT_ATTR_INITIALIZER_KEY(&should_failslab_active.key),
>  	.ignore_gfp_reclaim = true,
>  	.cache_filter = false,
>  };
> diff --git a/mm/slab.h b/mm/slab.h
> index 5f8f47c5bee0..792e19cb37b8 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -11,6 +11,7 @@
>  #include <linux/memcontrol.h>
>  #include <linux/kfence.h>
>  #include <linux/kasan.h>
> +#include <linux/jump_label.h>
>
>  /*
>   * Internal slab definitions
> @@ -160,6 +161,8 @@ static_assert(IS_ALIGNED(offsetof(struct slab, freelist), sizeof(freelist_aba_t)
>   */
>  #define slab_page(s) folio_page(slab_folio(s), 0)
>
> +DECLARE_STATIC_KEY_FALSE(should_failslab_active);
> +
>  /*
>   * If network-based swap is enabled, sl*b must keep track of whether pages
>   * were allocated from pfmemalloc reserves.
> diff --git a/mm/slub.c b/mm/slub.c
> index 0809760cf789..3bb579760a37 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3874,13 +3874,15 @@ static __always_inline void maybe_wipe_obj_freeptr(struct kmem_cache *s,
>  			0, sizeof(void *));
>  }
>
> +DEFINE_STATIC_KEY_FALSE(should_failslab_active);
> +
>  noinline int should_failslab(struct kmem_cache *s, gfp_t gfpflags)
>  {
>  	if (__should_failslab(s, gfpflags))
>  		return -ENOMEM;
>  	return 0;
>  }
> -ALLOW_ERROR_INJECTION(should_failslab, ERRNO);
> +ALLOW_ERROR_INJECTION_KEY(should_failslab, ERRNO, &should_failslab_active);
>
>  static __fastpath_inline
>  struct kmem_cache *slab_pre_alloc_hook(struct kmem_cache *s, gfp_t flags)
> @@ -3889,8 +3891,10 @@ struct kmem_cache *slab_pre_alloc_hook(struct kmem_cache *s, gfp_t flags)
>
>  	might_alloc(flags);
>
> -	if (unlikely(should_failslab(s, flags)))
> -		return NULL;
> +	if (static_branch_unlikely(&should_failslab_active)) {
> +		if (should_failslab(s, flags))
> +			return NULL;
> +	}

makes sense.

Acked-by: Alexei Starovoitov <ast@xxxxxxxxxx>

Do you have any microbenchmark numbers before/after this optimization?
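
For readers unfamiliar with the pattern being acked above: the out-of-line
hook is only reached once the jump label has been flipped by the injection
framework; until then the callsite compiles down to a NOP on the hot path.
Below is a minimal, self-contained sketch of that general pattern with
generic names (example_hook and example_hook_active are illustrative, not
the identifiers used in the patch):

#include <linux/jump_label.h>

/* false by default: the guarded branch is patched out until enabled */
DEFINE_STATIC_KEY_FALSE(example_hook_active);

/* the rarely-needed, out-of-line check lives here */
noinline int example_hook(void)
{
	return 0;
}

static inline int maybe_call_hook(void)
{
	/* hot path: a NOP unless the key has been enabled */
	if (static_branch_unlikely(&example_hook_active))
		return example_hook();
	return 0;
}

/* the injection framework flips the key when the feature is configured */
static void example_enable_injection(void)
{
	static_branch_enable(&example_hook_active);
}

static void example_disable_injection(void)
{
	static_branch_disable(&example_hook_active);
}

static_branch_enable()/static_branch_disable() patch the branch at runtime,
which is why the slab allocation fast path pays essentially nothing while
failslab and error injection are left unused.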