Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if possible

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Fri, 3 Mar 2023 12:58:13 -0800

On Fri, Mar 3, 2023 at 12:39 PM Mateusz Guzik <mjguzik@xxxxxxxxx> wrote:
>
> I think there is a systemic problem which comes with the kzalloc API

Well, it's not necessarily the API that is bad, but the implementation.

We could easily make kzalloc() with a constant size just expand to
kmalloc+memset, and get the behavior you want.

We already do magical things for "find the right slab bucket" part of
kmalloc too for constant sizes. It's changed over the years, but that
policy goes back a long long time. See

   https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/?id=95203fe78007f9ab3aebb96606473ae18c00a5a8

from the BK history tree.

Exactly because some things are worth optimizing for when the size is
known at compile time.

Maybe just extending kzalloc() similarly? Trivial and entirely untested patch:

   --- a/include/linux/slab.h
   +++ b/include/linux/slab.h
   @@ -717,6 +717,12 @@ static inline void *kmem_cache_zalloc(struct
kmem_cache *k, gfp_t flags)
     */
    static inline __alloc_size(1) void *kzalloc(size_t size, gfp_t flags)
    {
   +    if (__builtin_constant_p(size)) {
   +            void *ret = kmalloc(size, flags);
   +            if (ret)
   +                    memset(ret, 0, size);
   +            return ret;
   +    }
        return kmalloc(size, flags | __GFP_ZERO);
    }

This may well be part of what has changed over the years. People have
done a *lot* of pseudo-automated "kmalloc+memset -> kzalloc" code
simplification. And in the process we've lost a lot of good
optimizations.

I used to do profiling religiously, but these days I only do it for
particular areas (usually just the walking part of pathname lookup)

             Linus