Re: [Lsf-pc] [LSF/MM/BPF TOPIC] SLUB: what's next?

Roman Gushchin <roman.gushchin@xxxxxxxxx> · Mon, 6 May 2024 14:04:54 -0700

> On May 2, 2024, at 2:26 AM, Vlastimil Babka <vbabka@xxxxxxx> wrote:
> 
> 
> 
>> On 5/2/24 09:59, Michal Hocko wrote:
>>> On Tue 30-04-24 17:42:18, Vlastimil Babka wrote:
>>> Hi,
>>> 
>>> I'd like to propose a session about the next steps for SLUB. This is
>>> different from the BOF about sheaves that Matthew suggested, which would be
>>> not suitable for the whole group due to being not fleshed out enough yet.
>>> But the session could be scheduled after the BOF so if we do brainstorm
>>> something promising there, the result could be discussed as part of the full
>>> session.
>>> 
>>> Aside from that my preliminary plan is to discuss:
>>> 
>>> - what was made possible by reducing the slab allocators implementations to
>>> a single one, and what else could be done now with a single implementation
>>> 
>>> - the work-in-progress work (for now in the context of maple tree) on SLUB
>>> per-cpu array caches and preallocation
>>> 
>>> - what functionality would SLUB need to gain so the extra caching done by
>>> bpf allocator on top wouldn't be necessary? (kernel/bpf/memalloc.c)
>>> 
>>> - similar wrt lib/objpool.c (did you even noticed it was added? :)
>>> 
>>> - maybe the mempool functionality could be better integrated as well?
>>> 
>>> - are there more cases where people have invented layers outside mm and that
>>> could be integrated with some effort? IIRC io_uring also has some caching on
>>> top currently...
>>> 
>>> - better/more efficient memcg integration?

This is definitely an interesting topic, especially in light of the recent slab accounting performance conversations with Linus. Unfortunately I’m not attending in person this year, but I’d be happy to join virtually if that’s possible.

It’s not yet entirely clear to me if the kmem accounting performance problem exists outside of some micro-benchmarks.

Additionally, Linus proposed optimizing for cases where allocations might be short-lived. In the proposed form it would complicate call sites significantly, but maybe we need some sort of transactional API, e.g.:

memcg_kmem_local_accounting_start();
p1 = kmalloc(size1, GFP_ACCOUNT_LOCAL);	/* GFP_ACCOUNT_LOCAL is a proposed flag, not an existing one */
p2 = kmalloc(size2, GFP_ACCOUNT_LOCAL);
…
kfree(p1);
memcg_kmem_local_accounting_commit();

In this case, all allocations within the transaction would be recorded in a temporary buffer and not fully accounted until memcg_kmem_local_accounting_commit(), which should make them much faster. The caller, however, must guarantee that these allocations won’t be freed from any other context before memcg_kmem_local_accounting_commit().
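
To make the idea a bit more concrete, here is a rough userspace model of the batching, not kernel code. Everything in it is hypothetical: memcg_charged_bytes stands in for the shared memcg counter, struct local_acct for the temporary buffer, and acct_alloc()/acct_free() for kmalloc()/kfree() with the proposed flag. It only illustrates that charges and uncharges inside the transaction cancel out locally, so the shared counter is touched once at commit time.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the shared (contended) memcg charge counter. */
static atomic_long memcg_charged_bytes;
/* How many times the shared counter was updated, i.e. the expensive step. */
static atomic_long shared_counter_updates;

/* Hypothetical per-context transaction state (the "temporary buffer"). */
struct local_acct {
	int active;
	long pending_bytes;	/* net bytes charged inside the transaction */
};
static _Thread_local struct local_acct txn;

/* Size header in front of each object so a free knows what to uncharge. */
union obj_hdr {
	size_t size;
	max_align_t align;	/* keeps the payload suitably aligned */
};

static void charge(long bytes)
{
	if (txn.active) {
		/* Fast path: only the local buffer is touched. */
		txn.pending_bytes += bytes;
		return;
	}
	/* Slow path: update the shared counter immediately. */
	atomic_fetch_add(&memcg_charged_bytes, bytes);
	atomic_fetch_add(&shared_counter_updates, 1);
}

static void local_accounting_start(void)
{
	assert(!txn.active);	/* no nesting in this sketch */
	txn.active = 1;
	txn.pending_bytes = 0;
}

static void local_accounting_commit(void)
{
	assert(txn.active);
	txn.active = 0;
	if (txn.pending_bytes) {
		/* One shared update for everything that survived the transaction. */
		atomic_fetch_add(&memcg_charged_bytes, txn.pending_bytes);
		atomic_fetch_add(&shared_counter_updates, 1);
	}
}

static void *acct_alloc(size_t size)
{
	union obj_hdr *h = malloc(sizeof(*h) + size);

	if (!h)
		return NULL;
	h->size = size;
	charge((long)size);
	return h + 1;
}

static void acct_free(void *p)
{
	union obj_hdr *h;

	if (!p)
		return;
	h = (union obj_hdr *)p - 1;
	charge(-(long)h->size);
	free(h);
}

/* Mirrors the example above; returns the bytes charged right after commit. */
static long demo_transaction(void)
{
	void *p1, *p2;
	long charged;

	local_accounting_start();
	p1 = acct_alloc(64);
	p2 = acct_alloc(128);
	acct_free(p1);		/* cancelled locally, never reaches the counter */
	local_accounting_commit();

	charged = atomic_load(&memcg_charged_bytes);
	acct_free(p2);		/* outside the transaction: regular slow path */
	return charged;
}
```

In this model the short-lived p1 never reaches the shared counter at all. The model also makes the restriction visible: an object freed from another context before the commit would be uncharged from the shared counter while its charge is still sitting in the local buffer.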

Thanks!