On Mon, Dec 20, 2021 at 04:56:35PM +0800, Muchun Song wrote:
> We currently allocate scope for every memcg to be able to tracked on
> every superblock instantiated in the system, regardless of whether
> that superblock is even accessible to that memcg.
> 
> These huge memcg counts come from container hosts where memcgs are
> confined to just a small subset of the total number of superblocks
> that instantiated at any given point in time.
> 
> For these systems with huge container counts, list_lru does not need
> the capability of tracking every memcg on every superblock. What it
> comes down to is that adding the memcg to the list_lru at the first
> insert. So introduce kmem_cache_alloc_lru to allocate objects and its
> list_lru. In the later patch, we will convert all inode and dentry
> allocation from kmem_cache_alloc to kmem_cache_alloc_lru.
> 
> Signed-off-by: Muchun Song <songmuchun@xxxxxxxxxxxxx>
> ---
>  include/linux/list_lru.h   |   4 ++
>  include/linux/memcontrol.h |  14 ++++++
>  include/linux/slab.h       |   3 ++
>  mm/list_lru.c              | 104 +++++++++++++++++++++++++++++++++++++++++----
>  mm/memcontrol.c            |  14 ------
>  mm/slab.c                  |  39 +++++++++++------
>  mm/slab.h                  |  25 +++++++++--
>  mm/slob.c                  |   6 +++
>  mm/slub.c                  |  42 ++++++++++++------
>  9 files changed, 198 insertions(+), 53 deletions(-)
> 
> diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
> index 729a27b6ff53..ab912c49334f 100644
> --- a/include/linux/list_lru.h
> +++ b/include/linux/list_lru.h
> @@ -56,6 +56,8 @@ struct list_lru {
>  	struct list_head	list;
>  	int			shrinker_id;
>  	bool			memcg_aware;
> +	/* protects ->mlrus->mlru[i] */
> +	spinlock_t		lock;
>  	/* for cgroup aware lrus points to per cgroup lists, otherwise NULL */
>  	struct list_lru_memcg	__rcu *mlrus;
>  #endif
> @@ -72,6 +74,8 @@ int __list_lru_init(struct list_lru *lru, bool memcg_aware,
>  #define list_lru_init_memcg(lru, shrinker)		\
>  	__list_lru_init((lru), true, NULL, shrinker)
> 
> +int memcg_list_lru_alloc(struct mem_cgroup *memcg, struct list_lru *lru,
> +			 gfp_t gfp);
>  int memcg_update_all_list_lrus(int num_memcgs);
>  void memcg_drain_all_list_lrus(int src_idx, struct mem_cgroup *dst_memcg);
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 0c5c403f4be6..561ba47760db 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -520,6 +520,20 @@ static inline struct mem_cgroup *page_memcg_check(struct page *page)
>  	return (struct mem_cgroup *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK);
>  }
> 
> +static inline struct mem_cgroup *get_mem_cgroup_from_objcg(struct obj_cgroup *objcg)
> +{
> +	struct mem_cgroup *memcg;
> +
> +	rcu_read_lock();
> +retry:
> +	memcg = obj_cgroup_memcg(objcg);
> +	if (unlikely(!css_tryget(&memcg->css)))
> +		goto retry;
> +	rcu_read_unlock();
> +
> +	return memcg;
> +}
> +
>  #ifdef CONFIG_MEMCG_KMEM
>  /*
>   * folio_memcg_kmem - Check if the folio has the memcg_kmem flag set.
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 181045148b06..eccbd21d3753 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -135,6 +135,7 @@
> 
>  #include <linux/kasan.h>
> 
> +struct list_lru;
>  struct mem_cgroup;
>  /*
>   * struct kmem_cache related prototypes
> @@ -425,6 +426,8 @@ static __always_inline unsigned int __kmalloc_index(size_t size,
> 
>  void *__kmalloc(size_t size, gfp_t flags) __assume_kmalloc_alignment __alloc_size(1);
>  void *kmem_cache_alloc(struct kmem_cache *s, gfp_t flags) __assume_slab_alignment __malloc;
> +void *kmem_cache_alloc_lru(struct kmem_cache *s, struct list_lru *lru,
> +			   gfp_t gfpflags) __assume_slab_alignment __malloc;

I'm not a big fan of this patch: I don't see why preparing the lru
infrastructure has to be integrated this deep into the slab code.

Why can't kmem_cache_alloc_lru() be a simple wrapper like (pseudo-code):

void *kmem_cache_alloc_lru(struct kmem_cache *s, struct list_lru *lru,
			   gfp_t gfpflags)
{
	if (necessary)
		prepare_lru_infra();

	return kmem_cache_alloc(s, gfpflags);
}

In its current form the patch breaks the API layering. Maybe that is
strictly necessary, but we should have a __very__ strong reason for it.

Thanks!

cc Slab maintainers
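
P.S. To make the wrapper idea a bit more concrete, here is a rough sketch
of the shape I have in mind. It is only a sketch: memcg_list_lru_alloc()
and get_mem_cgroup_from_objcg() are the helpers from the quoted patch,
memcg_kmem_enabled()/get_obj_cgroup_from_current()/css_put() are existing
helpers, and the 0-on-success convention for memcg_list_lru_alloc() plus
the error handling are my assumptions (CONFIG_MEMCG_KMEM ifdefs omitted):

/*
 * Sketch only, not the patch: make sure the lru has a per-memcg slot
 * for the current memcg, then fall back to the plain slab fast path.
 */
void *kmem_cache_alloc_lru(struct kmem_cache *s, struct list_lru *lru,
			   gfp_t gfpflags)
{
	if (lru && memcg_kmem_enabled()) {
		struct obj_cgroup *objcg = get_obj_cgroup_from_current();

		if (objcg) {
			struct mem_cgroup *memcg = get_mem_cgroup_from_objcg(objcg);
			int ret;

			/* allocate lru->mlrus entry for this memcg if missing */
			ret = memcg_list_lru_alloc(memcg, lru, gfpflags);
			css_put(&memcg->css);
			obj_cgroup_put(objcg);
			if (ret)
				return NULL;
		}
	}

	return kmem_cache_alloc(s, gfpflags);
}

That keeps all the lru preparation on the list_lru/memcg side and leaves
the slab allocators untouched.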