On Mon, Oct 12, 2020 at 09:03:26AM +0530, Bharata B Rao wrote:
> On Fri, Oct 09, 2020 at 11:45:51AM -0700, Roman Gushchin wrote:
> > On Fri, Oct 09, 2020 at 11:34:23AM +0530, Bharata B Rao wrote:
> > 
> > Hi Bharata,
> > 
> > > Object cgroup charging is done for all the objects during
> > > allocation, but during freeing, uncharging ends up happening
> > > for only one object in the case of bulk allocation/freeing.
> > 
> > Yes, it's definitely a problem. Thank you for catching it!
> > 
> > I'm curious, did you find it in the wild or by looking into the code?
> 
> Found by looking at the code.
> 
> > 
> > > 
> > > Fix this by having a separate call to uncharge all the
> > > objects from kmem_cache_free_bulk() and by modifying
> > > memcg_slab_free_hook() to take care of bulk uncharging.
> > > 
> > > Signed-off-by: Bharata B Rao <bharata@xxxxxxxxxxxxx>
> > 
> > Please, add:
> > 
> > Fixes: 964d4bd370d5 ("mm: memcg/slab: save obj_cgroup for non-root slab objects")
> > Cc: stable@xxxxxxxxxxxxxxx
> > 
> > > ---
> > >  mm/slab.c |  2 +-
> > >  mm/slab.h | 42 +++++++++++++++++++++++++++---------------
> > >  mm/slub.c |  3 ++-
> > >  3 files changed, 30 insertions(+), 17 deletions(-)
> > > 
> > > diff --git a/mm/slab.c b/mm/slab.c
> > > index f658e86ec8cee..5c70600d8b1cc 100644
> > > --- a/mm/slab.c
> > > +++ b/mm/slab.c
> > > @@ -3440,7 +3440,7 @@ void ___cache_free(struct kmem_cache *cachep, void *objp,
> > >  	memset(objp, 0, cachep->object_size);
> > >  	kmemleak_free_recursive(objp, cachep->flags);
> > >  	objp = cache_free_debugcheck(cachep, objp, caller);
> > > -	memcg_slab_free_hook(cachep, virt_to_head_page(objp), objp);
> > > +	memcg_slab_free_hook(cachep, &objp, 1);
> > > 
> > >  	/*
> > >  	 * Skip calling cache_free_alien() when the platform is not numa.
> > > diff --git a/mm/slab.h b/mm/slab.h
> > > index 6cc323f1313af..6dd4b702888a7 100644
> > > --- a/mm/slab.h
> > > +++ b/mm/slab.h
> > > @@ -345,30 +345,42 @@ static inline void memcg_slab_post_alloc_hook(struct kmem_cache *s,
> > >  	obj_cgroup_put(objcg);
> > >  }
> > > 
> > > -static inline void memcg_slab_free_hook(struct kmem_cache *s, struct page *page,
> > > -					void *p)
> > > +static inline void memcg_slab_free_hook(struct kmem_cache *s_orig,
> > > +					void **p, int objects)
> > >  {
> > > +	struct kmem_cache *s;
> > >  	struct obj_cgroup *objcg;
> > > +	struct page *page;
> > >  	unsigned int off;
> > > +	int i;
> > > 
> > >  	if (!memcg_kmem_enabled())
> > >  		return;
> > > 
> > > -	if (!page_has_obj_cgroups(page))
> > > -		return;
> > > +	for (i = 0; i < objects; i++) {
> > > +		if (unlikely(!p[i]))
> > > +			continue;
> > > 
> > > -	off = obj_to_index(s, page, p);
> > > -	objcg = page_obj_cgroups(page)[off];
> > > -	page_obj_cgroups(page)[off] = NULL;
> > > +		page = virt_to_head_page(p[i]);
> > > +		if (!page_has_obj_cgroups(page))
> > > +			continue;
> > > 
> > > -	if (!objcg)
> > > -		return;
> > > +		if (!s_orig)
> > > +			s = page->slab_cache;
> > > +		else
> > > +			s = s_orig;
> > > 
> > > -	obj_cgroup_uncharge(objcg, obj_full_size(s));
> > > -	mod_objcg_state(objcg, page_pgdat(page), cache_vmstat_idx(s),
> > > -			-obj_full_size(s));
> > > +		off = obj_to_index(s, page, p[i]);
> > > +		objcg = page_obj_cgroups(page)[off];
> > > +		if (!objcg)
> > > +			continue;
> > > 
> > > -	obj_cgroup_put(objcg);
> > > +		page_obj_cgroups(page)[off] = NULL;
> > > +		obj_cgroup_uncharge(objcg, obj_full_size(s));
> > > +		mod_objcg_state(objcg, page_pgdat(page), cache_vmstat_idx(s),
> > > +				-obj_full_size(s));
> > > +		obj_cgroup_put(objcg);
> > > +	}
> > >  }
> > > 
> > >  #else /* CONFIG_MEMCG_KMEM */
> > > @@ -406,8 +418,8 @@ static inline void memcg_slab_post_alloc_hook(struct kmem_cache *s,
> > >  {
> > >  }
> > > 
> > > -static inline void memcg_slab_free_hook(struct kmem_cache *s, struct page *page,
> > > -					void *p)
> > > +static inline void memcg_slab_free_hook(struct kmem_cache *s,
> > > +					void **p, int objects)
> > >  {
> > >  }
> > >  #endif /* CONFIG_MEMCG_KMEM */
> > > diff --git a/mm/slub.c b/mm/slub.c
> > > index 6d3574013b2f8..0cbe67f13946e 100644
> > > --- a/mm/slub.c
> > > +++ b/mm/slub.c
> > > @@ -3091,7 +3091,7 @@ static __always_inline void do_slab_free(struct kmem_cache *s,
> > >  	struct kmem_cache_cpu *c;
> > >  	unsigned long tid;
> > > 
> > > -	memcg_slab_free_hook(s, page, head);
> > > +	memcg_slab_free_hook(s, &head, 1);
> > 
> > Hm, I wonder if it's better to teach do_slab_free() to handle the (cnt > 1) case correctly?
> 
> Possible, but we will have to walk the detached freelist there while
> here it is much easier to just walk the array of objects?
> 
> > 
> > >  redo:
> > >  	/*
> > >  	 * Determine the currently cpus per cpu slab.
> > > @@ -3253,6 +3253,7 @@ void kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p)
> > >  	if (WARN_ON(!size))
> > >  		return;
> > > 
> > > +	memcg_slab_free_hook(s, p, size);
> > 
> > Then you don't need this change.
> > 
> > Otherwise memcg_slab_free_hook() can be called twice for the same object. It's ok from
> > accounting correctness perspective, because the first call will zero out the objcg pointer,
> > but still much better to be avoided.
> 
> Yes, for that one object there will be one additional uncharge call,
> but as you note it is handled by the !objcg check.

Got it. Thanks for the explanation!

Acked-by: Roman Gushchin <guro@xxxxxx>

Thanks!
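[Editorial note, not part of the thread: below is a minimal, illustrative sketch of the bulk allocation/free pattern the patch is about. The cache name "demo_cache", its object size, and NR_OBJS are made up for the example; only the kmem_cache_alloc_bulk()/kmem_cache_free_bulk() calls come from the quoted patch context.]

/*
 * Illustrative only: every object allocated from a SLAB_ACCOUNT cache is
 * charged to the caller's memcg individually, so a bulk free must also
 * uncharge each object.  Before this fix, the bulk-free path uncharged
 * only one object.
 */
#include <linux/slab.h>
#include <linux/module.h>

#define NR_OBJS 16	/* arbitrary, for the example */

static struct kmem_cache *demo_cache;	/* hypothetical cache */

static int __init demo_init(void)
{
	void *objs[NR_OBJS];
	int n;

	/* SLAB_ACCOUNT: allocations from this cache are memcg-charged. */
	demo_cache = kmem_cache_create("demo_cache", 256, 0, SLAB_ACCOUNT, NULL);
	if (!demo_cache)
		return -ENOMEM;

	/* Each successfully allocated object gets charged... */
	n = kmem_cache_alloc_bulk(demo_cache, GFP_KERNEL, NR_OBJS, objs);

	/*
	 * ...so freeing them in bulk must uncharge all n objects, which is
	 * what the memcg_slab_free_hook(s, p, size) call added to
	 * kmem_cache_free_bulk() takes care of.
	 */
	if (n)
		kmem_cache_free_bulk(demo_cache, n, objs);

	kmem_cache_destroy(demo_cache);
	return 0;
}
module_init(demo_init);
MODULE_LICENSE("GPL");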