+cc Linus

On Thu, Oct 07, 2021 at 05:32:52PM +0200, Vlastimil Babka wrote:
> On 10/5/21 17:31, Jens Axboe wrote:
> > Allocations can be a very hot path, and this out-of-line function
> > call is noticeable.
> >
> > Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>
> It used to be inline b4 (hi, Konstantin!) and then was converted to be like
> this intentionally :/
>
> See 4f6923fbb352 ("mm: make should_failslab always available for fault
> injection")
>
> And now also kernel/bpf/verifier.c contains:
> BTF_ID(func, should_failslab)
>
> I think either your or Andrew's version will break this BTF_ID thing, at the
> very least.
>
> But I do strongly agree that putting unconditionally a non-inline call into
> slab allocator fastpath sucks. Can we make it so that bpf can only do these
> overrides when CONFIG_FAILSLAB is enabled?
> I don't know, perhaps putting this BTF_ID() in #ifdef as well, or providing
> a dummy that is always available (so that nothing breaks), but doesn't
> actually affect slab_pre_alloc_hook() unless CONFIG_FAILSLAB has been enabled?
>

I just ran into it while looking at a kmalloc + kfree pair.

A toy test which calls this in a loop like so:

static long noinline custom_bench(void)
{
	void *buf;

	while (!signal_pending(current)) {
		buf = kmalloc(16, GFP_KERNEL);
		kfree(buf);
		cond_resched();
	}

	return -EINTR;
}

... shows this with perf top:

  57.88%  [kernel]  [k] kfree
  31.38%  [kernel]  [k] kmalloc_trace_noprof
   3.20%  [kernel]  [k] should_failslab.constprop.0

A side note is that I verified that the majority of the time in kfree
and kmalloc_trace_noprof is spent in cmpxchg16b, which is both good and
bad news.

As for should_failslab, it compiles to an empty func on production
kernels and is present even when there are no supported means of
instrumenting it. As in, everyone pays for its existence, even if there
is no way to use it.

Also note there are 3 unrelated mechanisms to alter the return code,
which imo is 2 too many. But more importantly, they are not even
coordinated.

A hard requirement for a long term solution is to not alter the fast
path beyond nops for hot patching.

So far I think implementing this in a clean manner would require
agreeing on some namespace for bpf ("failprobes"?) and coordinating
hotpatching between different mechanisms. Maybe there is a better way, I
don't know.

Here is the crux of my e-mail though:
1. turning should_failslab into a mandatory func call is an ok local
   hack for the test farm, not a viable approach for production
2. as such it is up to the original submitter (or whoever else wants to
   pick up the slack) to implement something which hotpatches the
   callsite as opposed to inducing a function call for everyone

In the meantime the routine should disappear unless explicitly included
in kernel config. The patch submitted here would be one way to do it.
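
To make "disappear unless explicitly included in kernel config" concrete, here is a rough userspace sketch of the shape I have in mind. It is not the submitted patch and not kernel code: the _sketch names and the failslab_sketch_armed flag are made up for illustration, and CONFIG_FAILSLAB is treated as a plain preprocessor macro (build with -DCONFIG_FAILSLAB to get the out-of-line variant).

```c
#include <errno.h>
#include <stddef.h>

#ifdef CONFIG_FAILSLAB
/* Hypothetical switch standing in for real fault-injection state. */
static int failslab_sketch_armed;

/* Kept out of line so that tooling has a real symbol to attach to. */
__attribute__((noinline))
int should_failslab_sketch(size_t size, unsigned int flags)
{
	(void)size;
	(void)flags;
	return failslab_sketch_armed ? -ENOMEM : 0;
}
#else
/* Not configured: an inline stub the compiler can fold away entirely. */
static inline int should_failslab_sketch(size_t size, unsigned int flags)
{
	(void)size;
	(void)flags;
	return 0;
}
#endif

/* Hypothetical caller, loosely modelled on a pre-alloc hook: 1 means proceed. */
static inline int pre_alloc_ok_sketch(size_t size, unsigned int flags)
{
	return should_failslab_sketch(size, flags) == 0;
}
```

This only covers the compile-time half. When the option is enabled the caller still eats a function call, which is exactly the part that would need callsite hotpatching to be acceptable on production kernels.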