Re: [PATCH] mm: don't call should_failslab() for !CONFIG_FAILSLAB

Vlastimil Babka <vbabka@xxxxxxx> · Fri, 31 May 2024 11:36:22 +0200

On 5/27/24 11:34 AM, Mateusz Guzik wrote:
> +cc Linus
> 
> On Thu, Oct 07, 2021 at 05:32:52PM +0200, Vlastimil Babka wrote:
>> On 10/5/21 17:31, Jens Axboe wrote:
>> > Allocations can be a very hot path, and this out-of-line function
>> > call is noticeable.
>> > 
>> > Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>> 
>> It used to be inline before (hi, Konstantin!) and then was converted to be like
>> this intentionally :/
>> 
>> See 4f6923fbb352 ("mm: make should_failslab always available for fault
>> injection")
>> 
>> And now also kernel/bpf/verifier.c contains:
>> BTF_ID(func, should_failslab)
>> 
>> I think either your version or Andrew's will break this BTF_ID thing, at the
>> very least.
>> 
>> But I do strongly agree that unconditionally putting a non-inline call into
>> the slab allocator fastpath sucks. Can we make it so that bpf can only do these
>> overrides when CONFIG_FAILSLAB is enabled?
>> I don't know, perhaps putting this BTF_ID() in #ifdef as well, or providing
>> a dummy that is always available (so that nothing breaks), but doesn't
>> actually affect slab_pre_alloc_hook() unless CONFIG_FAILSLAB has been enabled?
>> 
> 
> I just ran into it while looking at kmalloc + kfree pair.
> 
> A toy test which calls this in a loop like so:
> static long noinline custom_bench(void)
> {
>         void *buf;
> 
>         while (!signal_pending(current)) {
>                 buf = kmalloc(16, GFP_KERNEL);
>                 kfree(buf);
>                 cond_resched();
>         }
> 
>         return -EINTR;
> }
> 
> ... shows this with perf top:
>    57.88%  [kernel]           [k] kfree
>    31.38%  [kernel]           [k] kmalloc_trace_noprof
>     3.20%  [kernel]           [k] should_failslab.constprop.0
> 
> A side note is that I verified that the majority of the time in kfree and
> kmalloc_trace_noprof is spent in cmpxchg16b, which is both good and bad news.
> 
> As for should_failslab, it compiles to an empty func on production
> kernels and is present even when there are no supported means of
> instrumenting it. That is, everyone pays for its existence, even if
> there is no way to use it.
> 
> Also note there are 3 unrelated mechanisms to alter the return code,
> which imo is 2 too many. But more importantly they are not even
> coordinated.
> 
> A hard requirement for a long term solution is to not alter the fast
> path beyond nops for hot patching.
> 
> So far I think implementing this in a clean manner would require
> agreeing on some namespace for bpf ("failprobes"?) and coordinating
> hotpatching between different mechanisms. Maybe there is a better way,
> I don't know.

I've attempted something (not complete yet) here:

https://lore.kernel.org/all/20240531-fault-injection-statickeys-v1-0-a513fd0a9614@xxxxxxx/
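
To make the general idea concrete, here is a rough userspace sketch. It is
not the code from the series above. A plain bool stands in for a static key;
in the kernel the check would be a patched nop/jump rather than a load and
test. All names (failslab_enabled, should_fail_alloc, pre_alloc_hook,
toy_alloc) and the "fail anything over 64 bytes" policy are made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a static key; defaults to off. */
static bool failslab_enabled;

/* Hypothetical slow-path decision, only reached when the gate is on. */
static int should_fail_alloc(size_t size)
{
        /* Toy policy: fail every allocation larger than 64 bytes. */
        return size > 64 ? -1 : 0;
}

/* With the gate off, the fast path sees one branch and no function call. */
static inline int pre_alloc_hook(size_t size)
{
        if (__builtin_expect(failslab_enabled, 0))
                return should_fail_alloc(size);
        return 0;
}

static void *toy_alloc(size_t size)
{
        if (pre_alloc_hook(size))
                return NULL;
        return malloc(size);
}

/* Small self-check of the intended behaviour. */
static void gate_self_check(void)
{
        void *p;

        p = toy_alloc(128);
        assert(p != NULL);              /* gate off: no injection */
        free(p);

        failslab_enabled = true;
        p = toy_alloc(128);
        assert(p == NULL);              /* gate on: injected failure */
        p = toy_alloc(16);
        assert(p != NULL);              /* small allocations still succeed */
        free(p);
        failslab_enabled = false;
}
```

The point is only that whoever wants fault injection flips the gate, and
everyone else does not pay for a call into an empty function.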

> Here is the crux of my e-mail though:
> 1. turning should_failslab into a mandatory func call is an ok local
>    hack for the test farm, not a viable approach for production
> 2. as such it is up to the original submitter (or whoever else wants
>    to pick up the slack) to implement something which hotpatches the
>    callsite as opposed to inducing a function call for everyone
> 
> In the meantime the routine should disappear unless explicitly included
> in kernel config. The patch submitted here would be one way to do it.
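
For reference, the compile-time variant being asked for has roughly the
following shape. This is a userspace sketch with made-up names
(TOY_FAILSLAB, toy_should_failslab, toy_pre_alloc_hook), not the patch
itself: without the config switch the hook is an inline stub the compiler
can drop entirely, and with it there is a real out-of-line symbol to attach
to.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical config switch; build with -DTOY_FAILSLAB to enable it. */
#ifdef TOY_FAILSLAB
/* Out-of-line, so an external tool has a symbol to hook. */
__attribute__((noinline)) int toy_should_failslab(size_t size)
{
        /* Toy policy: fail every allocation larger than 64 bytes. */
        return size > 64 ? -1 : 0;
}
#else
/* Inline stub: always "do not fail", no call is emitted. */
static inline int toy_should_failslab(size_t size)
{
        (void)size;
        return 0;
}
#endif

static inline int toy_pre_alloc_hook(size_t size)
{
        return toy_should_failslab(size);
}

/* Self-check: small allocations are never failed in either build. */
static void toy_stub_self_check(void)
{
        assert(toy_pre_alloc_hook(16) == 0);
#ifndef TOY_FAILSLAB
        /* Without the switch, nothing is ever failed. */
        assert(toy_pre_alloc_hook(128) == 0);
#endif
}
```

The downside, as discussed earlier in the thread, is that anything
referencing the symbol unconditionally (such as the BTF_ID() entry) has to
be put under the same #ifdef.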