On Mon, Feb 24, 2025 at 05:13:25PM +0100, Vlastimil Babka wrote:
> Hi,
>
> I'd like to propose a session about the SLUB allocator.
>
> Mainly I would like to discuss the addition of the sheaves caching layer,
> the latest RFC posted at [1].
>
> The goals of that work are to:
>
> - Reduce fastpath overhead. The current freeing fastpath can only be used if
>   the target slab is still the cpu slab, which can only be expected for
>   very short-term allocations. Further improvements should come from the new
>   local_trylock_t primitive.
>
> - Improve efficiency of users such as the maple tree, thanks to more
>   efficient preallocations, and kfree_rcu batching/reuse.
>
> - Hopefully also facilitate further changes needed for bpf allocations, also
>   via local_trylock_t, that could possibly extend to the other parts of the
>   implementation as needed.
>
> The controversial discussion points I expect about this approach are:
>
> - Either sheaves will not support NUMA restrictions (as in the current RFC), or
>   bring back the alien cache flushing issues of SLAB (or there's a better idea?)
>
> - Will it be possible to eventually have sheaves enabled for every cache and
>   replace the current SLUB fastpaths with them? Arguably these are also not
>   very efficient when NUMA-restricted allocations are requested for varying
>   NUMA nodes (the cpu slab is flushed if it's from the wrong node, to load a
>   slab from the requested node).
>
> Besides sheaves, I'd like to summarize recent kfree_rcu() changes and we
> could discuss further improvements to that.
>
> Also we can discuss what's needed to support bpf allocations. I've talked
> about it last year, but then focused on other things, so Alexei has been
> driving that recently (so far in the page allocator).

What about pre-memcg-charged sheaves? We had to disable memcg charging of
some kernel allocations, and I think sheaves can help in re-enabling it.
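
For anyone following along without the RFC open, here is a toy userspace model of the caching idea under discussion. It is only a sketch and not the RFC's code: every name (toy_sheaf, toy_cache, toy_alloc, toy_free, SHEAF_CAPACITY) is hypothetical, malloc()/free() stand in for the slab slow paths, and per-CPU placement and locking (e.g. local_trylock_t) are left out entirely. The point it tries to show is that a free can take the fast path whenever the array has room, without caring which slab the object came from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical capacity; kept tiny so the slow paths are easy to exercise. */
#define SHEAF_CAPACITY 4

/* Toy "sheaf": a fixed-size stack of pointers to free objects. */
struct toy_sheaf {
	size_t size;                     /* number of cached objects */
	void *objects[SHEAF_CAPACITY];
};

/*
 * Toy cache with a single sheaf. A real implementation would keep this
 * state per CPU and protect it with a local lock; both are omitted here.
 */
struct toy_cache {
	size_t object_size;
	struct toy_sheaf sheaf;
	unsigned long slow_allocs;       /* allocations that missed the sheaf */
	unsigned long slow_frees;        /* frees that found the sheaf full */
};

void toy_cache_init(struct toy_cache *c, size_t object_size)
{
	assert(c != NULL && object_size > 0);
	c->object_size = object_size;
	c->sheaf.size = 0;
	c->slow_allocs = 0;
	c->slow_frees = 0;
}

void *toy_alloc(struct toy_cache *c)
{
	/* Fast path: pop the most recently freed object. */
	if (c->sheaf.size > 0)
		return c->sheaf.objects[--c->sheaf.size];

	/* Slow path: malloc() stands in for going to the slab layer. */
	c->slow_allocs++;
	return malloc(c->object_size);
}

void toy_free(struct toy_cache *c, void *obj)
{
	/* Fast path: push, regardless of where the object originated. */
	if (c->sheaf.size < SHEAF_CAPACITY) {
		c->sheaf.objects[c->sheaf.size++] = obj;
		return;
	}

	/* Slow path: free() stands in for returning the object to its slab. */
	c->slow_frees++;
	free(obj);
}

/* Return every cached object to the backing allocator. */
void toy_cache_drain(struct toy_cache *c)
{
	while (c->sheaf.size > 0)
		free(c->sheaf.objects[--c->sheaf.size]);
}
```

In this model, allocating two objects and freeing both leaves them in the sheaf, so the next toy_alloc() returns the last-freed pointer without touching the slow path. A pre-charged variant would presumably do its accounting when the sheaf is filled rather than per object, but that part is speculation and not modeled here.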