The patch titled
     Subject: mm, slab: extend slab/shrink to shrink all memcg caches
has been added to the -mm tree.  Its filename is
     mm-slab-extend-slab-shrink-to-shrink-all-memcg-caches.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-slab-extend-slab-shrink-to-shrink-all-memcg-caches.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-slab-extend-slab-shrink-to-shrink-all-memcg-caches.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Waiman Long <longman@xxxxxxxxxx>
Subject: mm, slab: extend slab/shrink to shrink all memcg caches

Currently, a value of "1" is written to the
/sys/kernel/slab/<slab>/shrink file to shrink the slab by flushing out
all the per-cpu slabs and free slabs in partial lists.  This can be
useful to squeeze out a bit more memory under extreme conditions as well
as making the active object counts in /proc/slabinfo more accurate.

This usually applies only to the root caches, as the
SLUB_MEMCG_SYSFS_ON option is usually not enabled and
"slub_memcg_sysfs=1" not set.  Even if memcg sysfs is turned on, it is
too cumbersome and impractical to manage all those per-memcg sysfs files
in a real production system.

So there is no practical way to shrink memcg caches.  Fix this by
enabling a proper write to the shrink sysfs file of the root cache to
scan all the available memcg caches and shrink them as well.  For a
non-root memcg cache (when SLUB_MEMCG_SYSFS_ON or slub_memcg_sysfs is
on), only that cache will be shrunk when written.
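
For readers who want to drive the shrink file from a program rather than
from the shell, the write described above can be sketched in userspace C.
This is only an illustration: the helper names slab_shrink_path() and
slab_shrink_write() are hypothetical and not part of the patch, and the
code assumes a SLUB kernel exposing /sys/kernel/slab and a caller with
permission to write there.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Build "/sys/kernel/slab/<cache>/shrink" into buf.
 * Hypothetical helper; returns 0 or -ENAMETOOLONG.
 */
static int slab_shrink_path(char *buf, size_t len, const char *cache)
{
	int n;

	assert(buf != NULL && cache != NULL);

	n = snprintf(buf, len, "/sys/kernel/slab/%s/shrink", cache);
	if (n < 0 || (size_t)n >= len)
		return -ENAMETOOLONG;
	return 0;
}

/*
 * Write "1" to the given shrink file.
 * Hypothetical helper; returns 0 or a negative errno value.
 */
static int slab_shrink_write(const char *shrink_path)
{
	ssize_t n;
	int fd, ret = 0;

	assert(shrink_path != NULL);

	fd = open(shrink_path, O_WRONLY);
	if (fd < 0)
		return -errno;

	/* shrink_store() rejects input that does not start with '1'. */
	n = write(fd, "1", 1);
	if (n < 0)
		ret = -errno;
	else if (n != 1)
		ret = -EIO;

	close(fd);
	return ret;
}
```

A caller would build the path for a root cache such as task_struct with
slab_shrink_path() and pass it to slab_shrink_write().  With this patch
applied, that single write to the root cache's file is expected to shrink
its memcg child caches too.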

On a 2-socket 64-core 256-thread arm64 system with 64k pages, after a
parallel kernel build, the amount of memory occupied by slabs before
shrinking slabs was:

 # grep task_struct /proc/slabinfo
 task_struct        53137  53192   4288   61    4 : tunables    0    0    0 : slabdata    872    872      0
 # grep "^S[lRU]" /proc/meminfo
 Slab:            3936832 kB
 SReclaimable:     399104 kB
 SUnreclaim:      3537728 kB

After shrinking slabs (by echoing "1" to all shrink files):

 # grep "^S[lRU]" /proc/meminfo
 Slab:            1356288 kB
 SReclaimable:     263296 kB
 SUnreclaim:      1092992 kB
 # grep task_struct /proc/slabinfo
 task_struct         2764   6832   4288   61    4 : tunables    0    0    0 : slabdata    112    112      0

Link: http://lkml.kernel.org/r/20190723151445.7385-1-longman@xxxxxxxxxx
Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
Acked-by: Roman Gushchin <guro@xxxxxx>
Acked-by: Christoph Lameter <cl@xxxxxxxxx>
Cc: Pekka Enberg <penberg@xxxxxxxxxx>
Cc: David Rientjes <rientjes@xxxxxxxxxx>
Cc: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Shakeel Butt <shakeelb@xxxxxxxxxx>
Cc: Vladimir Davydov <vdavydov.dev@xxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 Documentation/ABI/testing/sysfs-kernel-slab |   13 ++++--
 mm/slab.h                                   |    1 
 mm/slab_common.c                            |   37 ++++++++++++++++++
 mm/slub.c                                   |    2 
 4 files changed, 48 insertions(+), 5 deletions(-)

--- a/Documentation/ABI/testing/sysfs-kernel-slab~mm-slab-extend-slab-shrink-to-shrink-all-memcg-caches
+++ a/Documentation/ABI/testing/sysfs-kernel-slab
@@ -429,10 +429,15 @@ KernelVersion:	2.6.22
 Contact:	Pekka Enberg <penberg@xxxxxxxxxxxxxx>,
 		Christoph Lameter <cl@xxxxxxxxxxxxxxxxxxxx>
 Description:
-		The shrink file is written when memory should be reclaimed from
-		a cache.  Empty partial slabs are freed and the partial list is
-		sorted so the slabs with the fewest available objects are used
-		first.
+		The shrink file is used to reclaim unused slab cache
+		memory from a cache.  Empty per-cpu or partial slabs
+		are freed and the partial list is sorted so the slabs
+		with the fewest available objects are used first.
+		It only accepts a value of "1" on write for shrinking
+		the cache. Other input values are considered invalid.
+		Shrinking slab caches might be expensive and can
+		adversely impact other running applications.  So it
+		should be used with care.
 
 What:		/sys/kernel/slab/cache/slab_size
 Date:		May 2007
--- a/mm/slab_common.c~mm-slab-extend-slab-shrink-to-shrink-all-memcg-caches
+++ a/mm/slab_common.c
@@ -981,6 +981,43 @@ int kmem_cache_shrink(struct kmem_cache
 }
 EXPORT_SYMBOL(kmem_cache_shrink);
 
+/**
+ * kmem_cache_shrink_all - shrink a cache and all memcg caches for root cache
+ * @s: The cache pointer
+ */
+void kmem_cache_shrink_all(struct kmem_cache *s)
+{
+	struct kmem_cache *c;
+
+	if (!IS_ENABLED(CONFIG_MEMCG_KMEM) || !is_root_cache(s)) {
+		kmem_cache_shrink(s);
+		return;
+	}
+
+	get_online_cpus();
+	get_online_mems();
+	kasan_cache_shrink(s);
+	__kmem_cache_shrink(s);
+
+	/*
+	 * We have to take the slab_mutex to protect from the memcg list
+	 * modification.
+	 */
+	mutex_lock(&slab_mutex);
+	for_each_memcg_cache(c, s) {
+		/*
+		 * Don't need to shrink deactivated memcg caches.
+		 */
+		if (c->flags & SLAB_DEACTIVATED)
+			continue;
+		kasan_cache_shrink(c);
+		__kmem_cache_shrink(c);
+	}
+	mutex_unlock(&slab_mutex);
+	put_online_mems();
+	put_online_cpus();
+}
+
 bool slab_is_available(void)
 {
 	return slab_state >= UP;
--- a/mm/slab.h~mm-slab-extend-slab-shrink-to-shrink-all-memcg-caches
+++ a/mm/slab.h
@@ -174,6 +174,7 @@ int __kmem_cache_shrink(struct kmem_cach
 void __kmemcg_cache_deactivate(struct kmem_cache *s);
 void __kmemcg_cache_deactivate_after_rcu(struct kmem_cache *s);
 void slab_kmem_cache_release(struct kmem_cache *);
+void kmem_cache_shrink_all(struct kmem_cache *s);
 
 struct seq_file;
 struct file;
--- a/mm/slub.c~mm-slab-extend-slab-shrink-to-shrink-all-memcg-caches
+++ a/mm/slub.c
@@ -5294,7 +5294,7 @@ static ssize_t shrink_store(struct kmem_
 				const char *buf, size_t length)
 {
 	if (buf[0] == '1')
-		kmem_cache_shrink(s);
+		kmem_cache_shrink_all(s);
 	else
 		return -EINVAL;
 	return length;
_

Patches currently in -mm which might be from longman@xxxxxxxxxx are

mm-slab-extend-slab-shrink-to-shrink-all-memcg-caches.patch
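
As a rough aid for reading the task_struct figures quoted in the
changelog, the sketch below parses one /proc/slabinfo row and estimates
the memory held by that cache as num_slabs * pagesperslab * page size.
The struct and function names are hypothetical, the parser assumes the
column layout shown in the changelog output, and the 64 kB page size is
the one stated for the test system.  Under those assumptions the rows
work out to 872 * 4 * 64 = 223232 kB before shrinking and
112 * 4 * 64 = 28672 kB after.

```c
#include <assert.h>
#include <stdio.h>

/* Fields of one /proc/slabinfo row that matter here (hypothetical type). */
struct slabinfo_row {
	char name[64];
	unsigned long active_objs;
	unsigned long num_objs;
	unsigned long objsize;
	unsigned long objperslab;
	unsigned long pagesperslab;
	unsigned long active_slabs;
	unsigned long num_slabs;
};

/* Parse one data row; returns 0 on success, -1 if the row does not match. */
static int parse_slabinfo_row(const char *line, struct slabinfo_row *r)
{
	int n;

	assert(line != NULL && r != NULL);

	/* The three tunables values are skipped with %*u. */
	n = sscanf(line,
		   "%63s %lu %lu %lu %lu %lu : tunables %*u %*u %*u : slabdata %lu %lu",
		   r->name, &r->active_objs, &r->num_objs, &r->objsize,
		   &r->objperslab, &r->pagesperslab,
		   &r->active_slabs, &r->num_slabs);
	return n == 8 ? 0 : -1;
}

/* Estimated memory backing the cache, in kB, for a given page size in kB. */
static unsigned long slab_footprint_kb(const struct slabinfo_row *r,
				       unsigned long page_kb)
{
	return r->num_slabs * r->pagesperslab * page_kb;
}
```

Feeding the two task_struct rows from the changelog through
parse_slabinfo_row() and slab_footprint_kb() with page_kb set to 64 gives
the before and after estimates above for this one cache.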