CC Roman for the slab tracking bits On Mon, Oct 04, 2021 at 02:46:45PM +0100, Matthew Wilcox (Oracle) wrote: > @@ -537,41 +537,41 @@ static inline bool PageMemcgKmem(struct page *page) > } > > /* > - * page_objcgs - get the object cgroups vector associated with a page > - * @page: a pointer to the page struct > + * slab_objcgs - get the object cgroups vector associated with a page > + * @slab: a pointer to the slab struct > * > - * Returns a pointer to the object cgroups vector associated with the page, > - * or NULL. This function assumes that the page is known to have an > + * Returns a pointer to the object cgroups vector associated with the slab, > + * or NULL. This function assumes that the slab is known to have an > * associated object cgroups vector. It's not safe to call this function > * against pages, which might have an associated memory cgroup: e.g. > * kernel stack pages. > */ > -static inline struct obj_cgroup **page_objcgs(struct page *page) > +static inline struct obj_cgroup **slab_objcgs(struct slab *slab) > { > - unsigned long memcg_data = READ_ONCE(page->memcg_data); > + unsigned long memcg_data = READ_ONCE(slab->memcg_data); > > - VM_BUG_ON_PAGE(memcg_data && !(memcg_data & MEMCG_DATA_OBJCGS), page); > - VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_KMEM, page); > + VM_BUG_ON_PAGE(memcg_data && !(memcg_data & MEMCG_DATA_OBJCGS), slab_page(slab)); > + VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_KMEM, slab_page(slab)); > > return (struct obj_cgroup **)(memcg_data & ~MEMCG_DATA_FLAGS_MASK); > } I like this whole patch series, but I think for memcg this is a particularly nice cleanup. Because right now we can have user pages pointing to a memcg, random alloc_page(GFP_ACCOUNT) pages pointing to an objcg, and slab pages pointing to an array of objcgs - all in the same memcg_data member. After your patch, slab->memcg_data points to an array of objcgs, period. The only time it doesn't is when there is a bug. Once the memcg_data member is no longer physically shared between page and slab, we can do: struct slab { struct obj_cgroup **objcgs; }; and ditch the accessor function altogether. > - * page_objcgs_check - get the object cgroups vector associated with a page > - * @page: a pointer to the page struct > + * slab_objcgs_check - get the object cgroups vector associated with a page > + * @slab: a pointer to the slab struct > * > - * Returns a pointer to the object cgroups vector associated with the page, > - * or NULL. This function is safe to use if the page can be directly associated > + * Returns a pointer to the object cgroups vector associated with the slab, > + * or NULL. This function is safe to use if the slab can be directly associated > * with a memory cgroup. > */ > -static inline struct obj_cgroup **page_objcgs_check(struct page *page) > +static inline struct obj_cgroup **slab_objcgs_check(struct slab *slab) > { > - unsigned long memcg_data = READ_ONCE(page->memcg_data); > + unsigned long memcg_data = READ_ONCE(slab->memcg_data); > > if (!memcg_data || !(memcg_data & MEMCG_DATA_OBJCGS)) > return NULL; > > - VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_KMEM, page); > + VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_KMEM, slab_page(slab)); > > return (struct obj_cgroup **)(memcg_data & ~MEMCG_DATA_FLAGS_MASK); This is a bit weird. The function is used in one place, to check whether a random page is a slab page. It's essentially a generic type check on the page! After your changes, you pass a struct slab that might well be invalid if this isn't a slab page, and you rely on the PAGE's memcg_data to tell you whether this is the case. It works because page->memcg_data is overlaid with slab->memcg_data, but that won't be the case if we allocate struct slab separately. To avoid that trap down the road, I think it would be better to keep the *page* the ambiguous object for now, and only resolve to struct slab after the type check. So that every time you see struct slab, you know it's valid. In fact, I think it would be best to just inline page_objcgs_check() into its sole caller. It would clarify the resolution from wildcard page to valid struct slab quite a bit: > @@ -2819,38 +2819,39 @@ int memcg_alloc_page_obj_cgroups(struct page *page, struct kmem_cache *s, > */ > struct mem_cgroup *mem_cgroup_from_obj(void *p) > { > - struct page *page; > + struct slab *slab; > > if (mem_cgroup_disabled()) > return NULL; > > - page = virt_to_head_page(p); > + slab = virt_to_slab(p); > > /* > * Slab objects are accounted individually, not per-page. > * Memcg membership data for each individual object is saved in > - * the page->obj_cgroups. > + * the slab->obj_cgroups. > */ > - if (page_objcgs_check(page)) { > + if (slab_objcgs_check(slab)) { I.e. do this instead: page = virt_to_head_page(p); /* object is backed by slab */ if (page->memcg_data & MEMCG_DATA_OBJCGS) { struct slab *slab = (struct slab *)page; objcg = slab_objcgs(...)[] return objcg ? obj_cgroup_memcg(objcg): NULL; } /* object is backed by a regular kernel page */ return page_memcg_check(page); > struct obj_cgroup *objcg; > unsigned int off; > > - off = obj_to_index(page->slab_cache, page, p); > - objcg = page_objcgs(page)[off]; > + off = obj_to_index(slab->slab_cache, slab, p); > + objcg = slab_objcgs(slab)[off]; > if (objcg) > return obj_cgroup_memcg(objcg); > > return NULL; > } > > + /* I am pretty sure this could just be 'return NULL' */ No, we could still be looking at a regular page that is being tracked by memcg. People do (void *)__get_free_pages(GFP_ACCOUNT). So this needs to stay 'return page_memcg_check()'.