Re: a case for a destructor for slub: mm_struct

Vlastimil Babka <vbabka@xxxxxxx> · Mon, 17 Mar 2025 10:02:48 +0100

On 3/17/25 06:42, Harry Yoo wrote:
> On Fri, Mar 14, 2025 at 01:32:16PM +0100, Mateusz Guzik wrote:
>> 
>> It's a spinlock which disables interrupts around itself, so it should
>> not be a problem.
>> 
>> > > > > there may be spurious mm_struct's hanging out and eating pcpu resources.
>> > > > > Something can be added to reclaim those by the pcpu allocator.
>> > > >
>> > > > Not sure if I follow. What do you mean by spurious mm_struct, and how
>> > > > does the pcpu allocator reclaim that?
>> > > >
>> > >
>> > > Suppose a workload was run which created tons of mm_struct. The
>> > > workload is done and they can be reclaimed, but hang out just in case.
>> > >
>> > > Another workload showed up, but one which wants to do many percpu
>> > > allocs and is not mm_struct-heavy.
>> > >
>> > > In case of resource shortage it would be good if the percpu allocator
>> > > knew how to reclaim the known cached-but-not-used memory instead of
>> > > grabbing new patches.
>> > >
>> > > As for how to get there, so happens the primary consumer (percpu
>> > > counters) already has a global list of all allocated objects. The
>> > > allocator could walk it and reclaim as needed.
>> >
>> > You mean reclaiming per-cpu objects along with the slab objects that use them?
>> > That sounds like a new slab shrinker for mm_struct?
>> >
>> 
>> at least the per-cpu thing, mm_struct itself optionally
> 
> If we allow reclaiming per-cpu stuff only but do not reclaim
> the slab object that contains it...
> 
> Does that mean the users of the cache need to check if the percpu
> memory has been reclaimed and if so, should call init routines (e.g.,
> mm_init())?

That sounds like something we'd better avoid? I think it would require
some locking between the shrinker and the slab allocator, so that the
allocator doesn't hand out an mm_struct whose percpu memory has been
reclaimed.
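
To make the concern concrete, here is a minimal user-space sketch of what
"reclaim the percpu part only" would force on the allocation path. It is not
kernel code: toy_mm, toy_cache, toy_lock and the toy_mm_*() functions are
hypothetical names, a pthread mutex stands in for whatever lock the shrinker
and the allocator would have to share, and a plain array stands in for the
percpu counters.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

#define TOY_NR_CPUS    4   /* assumed CPU count for the model */
#define TOY_CACHE_SIZE 8   /* assumed number of cached objects */

/* Hypothetical stand-in for a cached mm_struct; not the kernel layout. */
struct toy_mm {
	long *pcpu;    /* models the percpu memory the object "owns" */
	bool in_use;
};

static struct toy_mm toy_cache[TOY_CACHE_SIZE];
static pthread_mutex_t toy_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Allocation path: because the shrinker may have taken the percpu memory
 * away, it has to be re-checked (and re-created) under the shared lock.
 */
struct toy_mm *toy_mm_alloc(void)
{
	struct toy_mm *mm = NULL;

	pthread_mutex_lock(&toy_lock);
	for (int i = 0; i < TOY_CACHE_SIZE; i++) {
		if (toy_cache[i].in_use)
			continue;
		if (!toy_cache[i].pcpu) {
			toy_cache[i].pcpu = calloc(TOY_NR_CPUS, sizeof(long));
			if (!toy_cache[i].pcpu)
				break;  /* models allocation failure */
		}
		toy_cache[i].in_use = true;
		mm = &toy_cache[i];
		break;
	}
	pthread_mutex_unlock(&toy_lock);
	return mm;
}

/* Free path: the object stays cached with its percpu memory attached. */
void toy_mm_free(struct toy_mm *mm)
{
	assert(mm != NULL);
	pthread_mutex_lock(&toy_lock);
	mm->in_use = false;
	pthread_mutex_unlock(&toy_lock);
}

/*
 * Shrinker: strips percpu memory from cached-but-unused objects only.
 * Returns how many objects were stripped.
 */
int toy_mm_shrink(void)
{
	int reclaimed = 0;

	pthread_mutex_lock(&toy_lock);
	for (int i = 0; i < TOY_CACHE_SIZE; i++) {
		if (!toy_cache[i].in_use && toy_cache[i].pcpu) {
			free(toy_cache[i].pcpu);
			toy_cache[i].pcpu = NULL;
			reclaimed++;
		}
	}
	pthread_mutex_unlock(&toy_lock);
	return reclaimed;
}
```

In this model every allocation takes the shared lock and may have to redo the
percpu setup, which is the kind of coupling between shrinker and allocator
that seems better avoided.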

I hope it's enough if we're able to shrink what the slab allocator has
cached in per-cpu (partial) slabs. There's already flushing of that from
e.g. sysfs, but I can't recall if there's a shrinker. Of course there will
always be free mm_struct objects in partially full slabs due to
fragmentation, but I doubt we'd need to worry specifically about the percpu
memory those "own".
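
A toy illustration of the fragmentation point, under the assumption that a
shrink pass can only give back slabs with no objects in use: free objects
sitting in partially full slabs stay cached, and with a destructor-style
scheme they would keep whatever percpu memory they hold. The names
(toy_shrink, toy_shrink_result, TOY_OBJS_PER_SLAB) and the slab size are
made up for the model.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_OBJS_PER_SLAB 4   /* assumed objects per slab in this model */

struct toy_shrink_result {
	size_t slabs_released;      /* slabs with no objects in use */
	size_t free_objs_stranded;  /* free objects left in partial slabs */
};

/*
 * inuse[i] is the number of allocated objects in slab i. Only completely
 * unused slabs are released; free objects in partial slabs are counted as
 * stranded.
 */
struct toy_shrink_result toy_shrink(const unsigned int *inuse, size_t nr_slabs)
{
	struct toy_shrink_result r = { 0, 0 };

	for (size_t i = 0; i < nr_slabs; i++) {
		assert(inuse[i] <= TOY_OBJS_PER_SLAB);
		if (inuse[i] == 0)
			r.slabs_released++;
		else
			r.free_objs_stranded += TOY_OBJS_PER_SLAB - inuse[i];
	}
	return r;
}
```

For example, with per-slab usage of {0, 1, 4, 3, 0}, two slabs can be
released and four free objects remain stranded in the partial ones.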