On Sun, 21 Jan 2024, Matthew Wilcox wrote:
> I'd like to keep this topic relevant to as many people as possible. I can add a proposal for a topic on both the PCP and Buddy allocators (I have a series of Thoughts on how the PCP allocator works in a memdesc world that I haven't written down & sent out yet).
Well, the intent of the PCP cache (I would not call it an allocator) is to provide cache-hot / TLB-hot pages. In some ways this is like the SLAB/SLUB situation, i.e. lists of objects vs. serving objects that are locally related.
Can we come up with a design that uses a huge page (or some arbitrary large page size) and then breaks out portions of that large page? That way TLB use can potentially be reduced (multiple sections of a large page share the same TLB entry), and defragmentation is helped because allocs and frees are focused on a small selection of large memory sections.
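Here is a minimal userspace sketch of what I mean (hypothetical names, not actual kernel code): take one 2 MiB-aligned block and bump-allocate 4 KiB sections out of it, so consecutive allocations land under the same large-page TLB entry and fragmentation stays confined to that block.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

#define HUGE_SZ (2UL << 20)	/* 2 MiB backing block */
#define SECT_SZ (4UL << 10)	/* 4 KiB sections carved from it */

struct huge_block {
	char *base;	/* start of the aligned large block */
	size_t used;	/* bytes already handed out */
};

/* Hand out the next section, NULL when the block is used up
 * (at which point a fresh block would come from the buddy allocator). */
static void *carve_section(struct huge_block *hb)
{
	void *sect;

	if (hb->used + SECT_SZ > HUGE_SZ)
		return NULL;
	sect = hb->base + hb->used;
	hb->used += SECT_SZ;
	return sect;
}

int main(void)
{
	struct huge_block hb = { aligned_alloc(HUGE_SZ, HUGE_SZ), 0 };
	void *a, *b;

	if (!hb.base)
		return 1;
	a = carve_section(&hb);
	b = carve_section(&hb);
	/* Both sections sit inside the same 2 MiB region -> one TLB entry. */
	printf("same huge page: %d\n",
	       ((uintptr_t)a & ~(HUGE_SZ - 1)) == ((uintptr_t)b & ~(HUGE_SZ - 1)));
	free(hb.base);
	return 0;
}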
This is roughly equivalent to a per-cpu page (folio?) in SLUB, where cache-hot objects can be served from a single memory section and also freed back without too much interaction with the higher-level, more expensive components of the allocator.
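And a similarly minimal sketch of the free side (again hypothetical, userspace-only): sections freed back are linked through their own first word onto a local freelist, much like a SLUB per-cpu slab's freelist, so the next allocation reuses a cache-hot section without going back to the higher-level allocator.

#include <stdio.h>
#include <stdlib.h>

/* Freed sections are linked through their own first word. */
struct section_cache {
	void *freelist;
};

static void section_free(struct section_cache *sc, void *sect)
{
	*(void **)sect = sc->freelist;	/* push locally; no locks, no buddy */
	sc->freelist = sect;
}

static void *section_alloc(struct section_cache *sc)
{
	void *sect = sc->freelist;

	if (sect)
		sc->freelist = *(void **)sect;	/* pop most recently freed */
	return sect;	/* NULL means: carve a new section / refill */
}

int main(void)
{
	struct section_cache sc = { NULL };
	void *s = aligned_alloc(4096, 4096);

	section_free(&sc, s);
	printf("cache-hot reuse: %d\n", section_alloc(&sc) == s);
	free(s);
	return 0;
}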