Re: [RFC PATCH] mm/slab: fix a memory leak on kobject_init_and_add() failure

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On 10/21/24 11:14, Hyeonggon Yoo wrote:
> When kobject_init_and_add() fails during cache creation,
> kobj->name can be leaked because SLUB does not call kobject_put(),
> which should have been invoked per the kobject API documentation.
> It has a bit of historical context, though; SLUB does not call
> kobject_put() to avoid double-free for struct kmem_cache because
> 1) simply calling it would free all resources related to the cache, and
> 2) struct kmem_cache descriptor is always freed by cache_cache()'s
> error handling path, causing struct kmem_cache to be freed twice.
> 
> This issue can be reproduced by creating new slab caches while applying
> failslab for kernfs_node_cache. This makes kobject_add_varg() succeed,
> but causes kobject_add_internal() to fail in kobject_init_and_add()
> during cache creation.
> 
> Historically, this issue has attracted developers' attention several times.
> Each time a fix addressed either the leak or the double-free,
> it caused the other issue. Let's summarize a bit of history here:
> 
>   The leak has existed since the early days of SLUB.
> 
>   Commit 54b6a731025f ("slub: fix leak of 'name' in sysfs_slab_add")
>   introduced a double-free bug while fixing the leak.
> 
>   Commit 80da026a8e5d ("mm/slub: fix slab double-free in case of duplicate
>   sysfs filename") re-introduced the leak while fixing the double-free
>   error.
> 
>   Commit dde3c6b72a16 ("mm/slub: fix a memory leak in sysfs_slab_add()")
>   fixed the memory leak, but it was later reverted by commit 757fed1d0898
>   ("Revert "mm/slub: fix a memory leak in sysfs_slab_add()"") to avoid
>   the double-free error.

Thanks a lot for digging that history out.

> This is where we are now: we've chosen a memory leak over a double-free.
> So, how can this be fixed? Let's change the logic this way:
> 
>   1. Move cache name allocation from __kmem_cache_create_args() to
>   do_kmem_cache_create() and free all resources associated with the cache
>   on error in do_kmem_cache_create() to simplify error handling.
> 
>   2. In do_kmem_cache_create(), assume all resources are already freed upon
>   sysfs_slab_add() failure and do not take the error handling path.

Sounds good.

> One more thing to address when calling kobject_put() in sysfs_slab_add()
> on kobject_init_and_add() failure is that it shouldn't destroy caches
> that were already created during the boot process just because it failed to
> create sysfs files.

Yeah that would likely lead to numerous errors.

> To avoid that, use a no-op function as the release function if SLUB is
> trying to create sysfs files for those early caches.
> This ensures that resources allocated by the kobject infrastructure are
> properly freed on error while keeping the early caches active.
> 
> Note that using a no-op function as the release function is not recommended
> by the kobject API document, but I can't come up with a better approach
> here.

So what if we just don't call kobject_put() in that case? We should not be
leaking anything in that case, we just accept that the cache wasn't added to
sysfs (should we pr_warn() btw?) but it will work otherwise? Since we want
the boot caches remain around anyway, does it matter if they are also an
initialized but not linked kobject?

I think the comment "If this function returns an error, kobject_put() must
be called" means that *if* you want to destroy it due to the failure, you
must use kobject_put() and not e.g. kfree(). But IMHO it doesn't mean you
must destroy it because of the kobject_add() failure.

Seems it would be less hacky to me than the dummy release function approach.

Thanks,
Vlastimil

> Signed-off-by: Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx>
> ---
>  mm/slab_common.c | 17 +++-------------
>  mm/slub.c        | 50 ++++++++++++++++++++++++++++++++++++++----------
>  2 files changed, 43 insertions(+), 24 deletions(-)
> 
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 62878edb0a81..d4ccd10f3098 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -236,14 +236,12 @@ static struct kmem_cache *create_cache(const char *name,
>  		goto out;
>  	err = do_kmem_cache_create(s, name, object_size, args, flags);
>  	if (err)
> -		goto out_free_cache;
> +		goto out;
>  
>  	s->refcount = 1;
>  	list_add(&s->list, &slab_caches);
>  	return s;
>  
> -out_free_cache:
> -	kmem_cache_free(kmem_cache, s);
>  out:
>  	return ERR_PTR(err);
>  }
> @@ -281,7 +279,6 @@ struct kmem_cache *__kmem_cache_create_args(const char *name,
>  					    slab_flags_t flags)
>  {
>  	struct kmem_cache *s = NULL;
> -	const char *cache_name;
>  	int err;
>  
>  #ifdef CONFIG_SLUB_DEBUG
> @@ -332,18 +329,10 @@ struct kmem_cache *__kmem_cache_create_args(const char *name,
>  	if (s)
>  		goto out_unlock;
>  
> -	cache_name = kstrdup_const(name, GFP_KERNEL);
> -	if (!cache_name) {
> -		err = -ENOMEM;
> -		goto out_unlock;
> -	}
> -
>  	args->align = calculate_alignment(flags, args->align, object_size);
> -	s = create_cache(cache_name, object_size, args, flags);
> -	if (IS_ERR(s)) {
> +	s = create_cache(name, object_size, args, flags);
> +	if (IS_ERR(s))
>  		err = PTR_ERR(s);
> -		kfree_const(cache_name);
> -	}
>  
>  out_unlock:
>  	mutex_unlock(&slab_mutex);
> diff --git a/mm/slub.c b/mm/slub.c
> index c95ac26dad07..1daf4de8dc33 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -330,10 +330,11 @@ struct track {
>  enum track_item { TRACK_ALLOC, TRACK_FREE };
>  
>  #ifdef SLAB_SUPPORTS_SYSFS
> -static int sysfs_slab_add(struct kmem_cache *);
> +static int sysfs_slab_add(struct kmem_cache *, bool);
>  static int sysfs_slab_alias(struct kmem_cache *, const char *);
>  #else
> -static inline int sysfs_slab_add(struct kmem_cache *s) { return 0; }
> +static inline int sysfs_slab_add(struct kmem_cache *s, bool sysfs_lazy)
> +							{ return 0; }
>  static inline int sysfs_slab_alias(struct kmem_cache *s, const char *p)
>  							{ return 0; }
>  #endif
> @@ -6137,7 +6138,12 @@ int do_kmem_cache_create(struct kmem_cache *s, const char *name,
>  {
>  	int err = -EINVAL;
>  
> -	s->name = name;
> +	s->name = kstrdup_const(name, GFP_KERNEL);
> +	if (!s->name) {
> +		err = -ENOMEM;
> +		goto out;
> +	}
> +
>  	s->size = s->object_size = size;
>  
>  	s->flags = kmem_cache_flags(flags, s->name);
> @@ -6204,16 +6210,20 @@ int do_kmem_cache_create(struct kmem_cache *s, const char *name,
>  		goto out;
>  	}
>  
> -	err = sysfs_slab_add(s);
> +	err = sysfs_slab_add(s, false);
>  	if (err)
> -		goto out;
> +		/* Resources are already freed on sysfs_slab_add() failure */
> +		return err;
>  
>  	if (s->flags & SLAB_STORE_USER)
>  		debugfs_slab_add(s);
>  
>  out:
> -	if (err)
> +	if (err) {
>  		__kmem_cache_release(s);
> +		kfree_const(s->name);
> +		kmem_cache_free(kmem_cache, s);
> +	}
>  	return err;
>  }
>  
> @@ -7162,11 +7172,22 @@ static void kmem_cache_release(struct kobject *k)
>  	slab_kmem_cache_release(to_slab(k));
>  }
>  
> +static void kmem_cache_sysfs_lazy_release(struct kobject *k) { }
> +
>  static const struct sysfs_ops slab_sysfs_ops = {
>  	.show = slab_attr_show,
>  	.store = slab_attr_store,
>  };
>  
> +/*
> + * Early slab caches shouldn't be destroyed just because SLUB failed to create
> + * sysfs files. use a no-op function as .release function.
> + */
> +static const struct kobj_type slab_sysfs_lazy_ktype = {
> +	.sysfs_ops = &slab_sysfs_ops,
> +	.release = kmem_cache_sysfs_lazy_release,
> +};
> +
>  static const struct kobj_type slab_ktype = {
>  	.sysfs_ops = &slab_sysfs_ops,
>  	.release = kmem_cache_release,
> @@ -7223,12 +7244,13 @@ static char *create_unique_id(struct kmem_cache *s)
>  	return name;
>  }
>  
> -static int sysfs_slab_add(struct kmem_cache *s)
> +static int sysfs_slab_add(struct kmem_cache *s, bool sysfs_lazy)
>  {
>  	int err;
>  	const char *name;
>  	struct kset *kset = cache_kset(s);
>  	int unmergeable = slab_unmergeable(s);
> +	const struct kobj_type *ktype;
>  
>  	if (!unmergeable && disable_higher_order_debug &&
>  			(slub_debug & DEBUG_METADATA_FLAGS))
> @@ -7253,9 +7275,14 @@ static int sysfs_slab_add(struct kmem_cache *s)
>  	}
>  
>  	s->kobj.kset = kset;
> -	err = kobject_init_and_add(&s->kobj, &slab_ktype, NULL, "%s", name);
> +	if (sysfs_lazy)
> +		ktype = &slab_sysfs_lazy_ktype;
> +	else
> +		ktype = &slab_ktype;
> +	err = kobject_init_and_add(&s->kobj, ktype, NULL, "%s", name);
> +
>  	if (err)
> -		goto out;
> +		goto out_put_kobj;
>  
>  	err = sysfs_create_group(&s->kobj, &slab_attr_group);
>  	if (err)
> @@ -7269,8 +7296,11 @@ static int sysfs_slab_add(struct kmem_cache *s)
>  	if (!unmergeable)
>  		kfree(name);
>  	return err;
> +
>  out_del_kobj:
>  	kobject_del(&s->kobj);
> +out_put_kobj:
> +	kobject_put(&s->kobj);
>  	goto out;
>  }
>  
> @@ -7337,7 +7367,7 @@ static int __init slab_sysfs_init(void)
>  	slab_state = FULL;
>  
>  	list_for_each_entry(s, &slab_caches, list) {
> -		err = sysfs_slab_add(s);
> +		err = sysfs_slab_add(s, true);
>  		if (err)
>  			pr_err("SLUB: Unable to add boot slab %s to sysfs\n",
>  			       s->name);





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux