On Mon, Nov 18, 2024 at 11:26 PM Vlastimil Babka <vbabka@xxxxxxx> wrote:
>
> On 11/18/24 14:13, Hyeonggon Yoo wrote:
> > On Wed, Nov 13, 2024 at 1:39 AM Vlastimil Babka <vbabka@xxxxxxx> wrote:
> >> +
> >> +/*
> >> + * Allocate from a sheaf obtained by kmem_cache_prefill_sheaf()
> >> + *
> >> + * Guaranteed not to fail as many allocations as was the requested count.
> >> + * After the sheaf is emptied, it fails - no fallback to the slab cache itself.
> >> + *
> >> + * The gfp parameter is meant only to specify __GFP_ZERO or __GFP_ACCOUNT
> >> + * memcg charging is forced over limit if necessary, to avoid failure.
> >> + */
> >> +void *
> >> +kmem_cache_alloc_from_sheaf_noprof(struct kmem_cache *s, gfp_t gfp,
> >> +		struct slab_sheaf *sheaf)
> >> +{
> >> +	void *ret = NULL;
> >> +	bool init;
> >> +
> >> +	if (sheaf->size == 0)
> >> +		goto out;
> >> +
> >> +	ret = sheaf->objects[--sheaf->size];
> >> +
> >> +	init = slab_want_init_on_alloc(gfp, s);
> >> +
> >> +	/* add __GFP_NOFAIL to force successful memcg charging */
> >> +	slab_post_alloc_hook(s, NULL, gfp | __GFP_NOFAIL, 1, &ret, init, s->object_size);
>
> > Maybe I'm missing something, but how can this be used for non-sleepable contexts
> > if __GFP_NOFAIL is used? I think we have to charge them when the sheaf

> AFAIK it forces memcg to simply charge even if allocated memory goes over
> the memcg limit. So there's no issue with a non-sleepable context, there
> shouldn't be memcg reclaim happening in that case.

Ok, but I am still worried about mem alloc profiling/memcg trying to
allocate some memory with the __GFP_NOFAIL flag and eventually passing it
to the buddy allocator, which does not want __GFP_NOFAIL without
__GFP_DIRECT_RECLAIM?

e.g.) memcg hook calls alloc_slab_obj_exts()->kcalloc_node()->....->alloc_pages()

> > is returned
> > via kmem_cache_prefill_sheaf(), just like users of bulk alloc/free?

> That would be very costly to charge/uncharge if most of the objects are not
> actually used - it's what we want to avoid here.
> Going over the memcgs limit a bit in a very rare case isn't considered such
> an issue, for example Linus advocated such approach too in another context.

Thanks for the explanation! That was a point I was missing.

> > Best,
> > Hyeonggon
> >
> >> +out:
> >> +	trace_kmem_cache_alloc(_RET_IP_, ret, s, gfp, NUMA_NO_NODE);
> >> +
> >> +	return ret;
> >> +}
> >> +
> >>  /*
> >>   * To avoid unnecessary overhead, we pass through large allocation requests
> >>   * directly to the page allocator. We use __GFP_COMP, because we will need to
> >>
> >> --
> >> 2.47.0
> >>
>
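
For context, a rough caller-side sketch of the prefill + alloc-from-sheaf
pattern being discussed above. Only kmem_cache_prefill_sheaf() and
kmem_cache_alloc_from_sheaf_noprof() are named in the quoted hunk; the
kmem_cache_alloc_from_sheaf() wrapper, the kmem_cache_return_sheaf()
cleanup helper, the exact signatures and the NULL-on-failure convention
are assumptions for illustration, not something this thread confirms.

#include <linux/slab.h>

/* 'cache' is a hypothetical kmem_cache created elsewhere. */
static int example_use_sheaf(struct kmem_cache *cache)
{
	struct slab_sheaf *sheaf;
	void *obj;

	/*
	 * Sleepable context: reserve up to 8 objects in one go.  Any
	 * reclaim or other expensive work happens here, not later.
	 */
	sheaf = kmem_cache_prefill_sheaf(cache, GFP_KERNEL, 8);
	if (!sheaf)		/* failure convention assumed */
		return -ENOMEM;

	/*
	 * Non-sleepable context: cannot fail until the 8 reserved objects
	 * are used up.  Per the quoted comment, gfp here is only meant to
	 * carry __GFP_ZERO or __GFP_ACCOUNT; with __GFP_ACCOUNT the memcg
	 * charge is forced through (via the internally added __GFP_NOFAIL)
	 * rather than charging every prefilled object up front, which is
	 * the trade-off described in the thread.
	 */
	obj = kmem_cache_alloc_from_sheaf(cache, __GFP_ACCOUNT, sheaf);
	if (obj)
		kmem_cache_free(cache, obj);	/* assumed: freed like any slab object */

	/* Back in sleepable context: give back any unused reservations. */
	kmem_cache_return_sheaf(cache, sheaf);

	return 0;
}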