Subject: + slab-get_online_mems-for-kmem_cache_createdestroyshrink.patch added to -mm tree
To: vdavydov@xxxxxxxxxxxxx, cl@xxxxxxxxx, isimatu.yasuaki@xxxxxxxxxxxxxx,
    laijs@xxxxxxxxxxxxxx, liuj97@xxxxxxxxx, penberg@xxxxxxxxxx,
    qiuxishi@xxxxxxxxxx, rafael.j.wysocki@xxxxxxxxx, rientjes@xxxxxxxxxx,
    tangchen@xxxxxxxxxxxxxx, toshi.kani@xxxxxx, wency@xxxxxxxxxxxxxx,
    zhangyanfei@xxxxxxxxxxxxxx
From: akpm@xxxxxxxxxxxxxxxxxxxx
Date: Fri, 18 Apr 2014 13:18:48 -0700


The patch titled
     Subject: slab: get_online_mems for kmem_cache_{create,destroy,shrink}
has been added to the -mm tree.  Its filename is
     slab-get_online_mems-for-kmem_cache_createdestroyshrink.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/slab-get_online_mems-for-kmem_cache_createdestroyshrink.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/slab-get_online_mems-for-kmem_cache_createdestroyshrink.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Vladimir Davydov <vdavydov@xxxxxxxxxxxxx>
Subject: slab: get_online_mems for kmem_cache_{create,destroy,shrink}

When we create a sl[au]b cache, we allocate kmem_cache_node structures
for each online NUMA node.  To handle nodes being taken online/offline,
we register a memory hotplug notifier and, for each kmem cache,
allocate/free the kmem_cache_node corresponding to the node that
changes its state.
To synchronize between the two paths we hold the slab_mutex both during
the cache creation/destruction path and while tuning the per-node parts
of kmem caches in the memory hotplug handler, but that's not quite
right, because it does not guarantee that a newly created cache will
have all of its kmem_cache_nodes initialized if it races with memory
hotplug.  For instance, in the case of slub:

    CPU0                            CPU1
    ----                            ----
    kmem_cache_create:              online_pages:
     __kmem_cache_create:            slab_memory_callback:
                                      slab_mem_going_online_callback:
                                       lock slab_mutex
                                       for each slab_caches list entry
                                           allocate kmem_cache node
                                       unlock slab_mutex
      lock slab_mutex
      init_kmem_cache_nodes:
       for_each_node_state(node, N_NORMAL_MEMORY)
           allocate kmem_cache node
      add kmem_cache to slab_caches list
      unlock slab_mutex
                                    online_pages (continued):
                                     node_states_set_node

As a result we'll get a kmem cache with not all of its kmem_cache_nodes
allocated.

To avoid issues like that we should hold get/put_online_mems() across
the whole kmem cache creation/destruction/shrink paths, just like we do
for cpu hotplug.  This patch does the trick.

Note that after it is applied there is no need to take the slab_mutex
for kmem_cache_shrink() any more, so it is removed from there.

Signed-off-by: Vladimir Davydov <vdavydov@xxxxxxxxxxxxx>
Cc: Christoph Lameter <cl@xxxxxxxxx>
Cc: Pekka Enberg <penberg@xxxxxxxxxx>
Cc: Tang Chen <tangchen@xxxxxxxxxxxxxx>
Cc: Zhang Yanfei <zhangyanfei@xxxxxxxxxxxxxx>
Cc: Toshi Kani <toshi.kani@xxxxxx>
Cc: Xishi Qiu <qiuxishi@xxxxxxxxxx>
Cc: Jiang Liu <liuj97@xxxxxxxxx>
Cc: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
Cc: David Rientjes <rientjes@xxxxxxxxxx>
Cc: Wen Congyang <wency@xxxxxxxxxxxxxx>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@xxxxxxxxxxxxxx>
Cc: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/slab.c        |   26 ++------------------------
 mm/slab.h        |    1 +
 mm/slab_common.c |   35 +++++++++++++++++++++++++++++++++--
 mm/slob.c        |    3 +--
 mm/slub.c        |    5 ++---
 5 files changed, 39 insertions(+), 31 deletions(-)

diff -puN mm/slab.c~slab-get_online_mems-for-kmem_cache_createdestroyshrink mm/slab.c
--- a/mm/slab.c~slab-get_online_mems-for-kmem_cache_createdestroyshrink
+++ a/mm/slab.c
@@ -2474,8 +2474,7 @@ out:
 	return nr_freed;
 }
 
-/* Called with slab_mutex held to protect against cpu hotplug */
-static int __cache_shrink(struct kmem_cache *cachep)
+int __kmem_cache_shrink(struct kmem_cache *cachep)
 {
 	int ret = 0, i = 0;
 	struct kmem_cache_node *n;
@@ -2496,32 +2495,11 @@ static int __cache_shrink(struct kmem_ca
 	return (ret ? 1 : 0);
 }
 
-/**
- * kmem_cache_shrink - Shrink a cache.
- * @cachep: The cache to shrink.
- *
- * Releases as many slabs as possible for a cache.
- * To help debugging, a zero exit status indicates all slabs were released.
- */
-int kmem_cache_shrink(struct kmem_cache *cachep)
-{
-	int ret;
-	BUG_ON(!cachep || in_interrupt());
-
-	get_online_cpus();
-	mutex_lock(&slab_mutex);
-	ret = __cache_shrink(cachep);
-	mutex_unlock(&slab_mutex);
-	put_online_cpus();
-	return ret;
-}
-EXPORT_SYMBOL(kmem_cache_shrink);
-
 int __kmem_cache_shutdown(struct kmem_cache *cachep)
 {
 	int i;
 	struct kmem_cache_node *n;
-	int rc = __cache_shrink(cachep);
+	int rc = __kmem_cache_shrink(cachep);
 
 	if (rc)
 		return rc;

diff -puN mm/slab.h~slab-get_online_mems-for-kmem_cache_createdestroyshrink mm/slab.h
--- a/mm/slab.h~slab-get_online_mems-for-kmem_cache_createdestroyshrink
+++ a/mm/slab.h
@@ -91,6 +91,7 @@ __kmem_cache_alias(const char *name, siz
 #define CACHE_CREATE_MASK (SLAB_CORE_FLAGS | SLAB_DEBUG_FLAGS | SLAB_CACHE_FLAGS)
 
 int __kmem_cache_shutdown(struct kmem_cache *);
+int __kmem_cache_shrink(struct kmem_cache *);
 
 struct seq_file;
 struct file;

diff -puN mm/slab_common.c~slab-get_online_mems-for-kmem_cache_createdestroyshrink mm/slab_common.c
--- a/mm/slab_common.c~slab-get_online_mems-for-kmem_cache_createdestroyshrink
+++ a/mm/slab_common.c
@@ -205,6 +205,8 @@ kmem_cache_create(const char *name, size
 	int err;
 
 	get_online_cpus();
+	get_online_mems();
+
 	mutex_lock(&slab_mutex);
 
 	err = kmem_cache_sanity_check(name, size);
@@ -239,6 +241,8 @@ kmem_cache_create(const char *name, size
 
 out_unlock:
 	mutex_unlock(&slab_mutex);
+
+	put_online_mems();
 	put_online_cpus();
 
 	if (err) {
@@ -272,6 +276,8 @@ void kmem_cache_create_memcg(struct mem_
 	char *cache_name;
 
 	get_online_cpus();
+	get_online_mems();
+
 	mutex_lock(&slab_mutex);
 
 	/*
@@ -295,6 +301,8 @@ void kmem_cache_create_memcg(struct mem_
 
 out_unlock:
 	mutex_unlock(&slab_mutex);
+
+	put_online_mems();
 	put_online_cpus();
 }
 
@@ -322,6 +330,8 @@ static int kmem_cache_destroy_memcg_chil
 void kmem_cache_destroy(struct kmem_cache *s)
 {
 	get_online_cpus();
+	get_online_mems();
+
 	mutex_lock(&slab_mutex);
 
 	s->refcount--;
@@ -350,15 +360,36 @@ void kmem_cache_destroy(struct kmem_cach
 	memcg_free_cache_params(s);
 	kfree(s->name);
 	kmem_cache_free(kmem_cache, s);
-	goto out_put_cpus;
+	goto out;
 
 out_unlock:
 	mutex_unlock(&slab_mutex);
-out_put_cpus:
+out:
+	put_online_mems();
 	put_online_cpus();
 }
 EXPORT_SYMBOL(kmem_cache_destroy);
 
+/**
+ * kmem_cache_shrink - Shrink a cache.
+ * @cachep: The cache to shrink.
+ *
+ * Releases as many slabs as possible for a cache.
+ * To help debugging, a zero exit status indicates all slabs were released.
+ */
+int kmem_cache_shrink(struct kmem_cache *cachep)
+{
+	int ret;
+
+	get_online_cpus();
+	get_online_mems();
+	ret = __kmem_cache_shrink(cachep);
+	put_online_mems();
+	put_online_cpus();
+	return ret;
+}
+EXPORT_SYMBOL(kmem_cache_shrink);
+
 int slab_is_available(void)
 {
 	return slab_state >= UP;

diff -puN mm/slob.c~slab-get_online_mems-for-kmem_cache_createdestroyshrink mm/slob.c
--- a/mm/slob.c~slab-get_online_mems-for-kmem_cache_createdestroyshrink
+++ a/mm/slob.c
@@ -620,11 +620,10 @@ int __kmem_cache_shutdown(struct kmem_ca
 	return 0;
 }
 
-int kmem_cache_shrink(struct kmem_cache *d)
+int __kmem_cache_shrink(struct kmem_cache *d)
 {
 	return 0;
 }
-EXPORT_SYMBOL(kmem_cache_shrink);
 
 struct kmem_cache kmem_cache_boot = {
 	.name = "kmem_cache",

diff -puN mm/slub.c~slab-get_online_mems-for-kmem_cache_createdestroyshrink mm/slub.c
--- a/mm/slub.c~slab-get_online_mems-for-kmem_cache_createdestroyshrink
+++ a/mm/slub.c
@@ -3422,7 +3422,7 @@ EXPORT_SYMBOL(kfree);
  * being allocated from last increasing the chance that the last objects
  * are freed in them.
  */
-int kmem_cache_shrink(struct kmem_cache *s)
+int __kmem_cache_shrink(struct kmem_cache *s)
 {
 	int node;
 	int i;
@@ -3478,7 +3478,6 @@ int kmem_cache_shrink(struct kmem_cache
 	kfree(slabs_by_inuse);
 	return 0;
 }
-EXPORT_SYMBOL(kmem_cache_shrink);
 
 static int slab_mem_going_offline_callback(void *arg)
 {
@@ -3486,7 +3485,7 @@ static int slab_mem_going_offline_callba
 
 	mutex_lock(&slab_mutex);
 	list_for_each_entry(s, &slab_caches, list)
-		kmem_cache_shrink(s);
+		__kmem_cache_shrink(s);
 	mutex_unlock(&slab_mutex);
 
 	return 0;
_

Patches currently in -mm which might be from vdavydov@xxxxxxxxxxxxx are

slub-fix-memcg_propagate_slab_attrs.patch
slb-charge-slabs-to-kmemcg-explicitly.patch
mm-get-rid-of-__gfp_kmemcg.patch
mm-get-rid-of-__gfp_kmemcg-fix.patch
slab-document-kmalloc_order.patch
memcg-un-export-__memcg_kmem_get_cache.patch
mem-hotplug-implement-get-put_online_mems.patch
slab-get_online_mems-for-kmem_cache_createdestroyshrink.patch

--
To unsubscribe from this list: send the line "unsubscribe mm-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html