On Thu, Sep 24, 2020 at 04:14:17PM -0400, Johannes Weiner wrote:
> On Tue, Sep 22, 2020 at 01:37:00PM -0700, Roman Gushchin wrote:
> > PageKmemcg flag is currently defined as a page type (like buddy,
> > offline, table and guard). Semantically it means that the page
> > was accounted as a kernel memory by the page allocator and has
> > to be uncharged on the release.
> >
> > As a side effect of defining the flag as a page type, the accounted
> > page can't be mapped to userspace (look at page_has_type() and
> > comments above). In particular, this blocks the accounting of
> > vmalloc-backed memory used by some bpf maps, because these maps
> > do map the memory to userspace.
> >
> > One option is to fix it by complicating the access to page->mapcount,
> > which provides some free bits for page->page_type.
> >
> > But it's way better to move this flag into page->memcg_data flags.
> > Indeed, the flag makes no sense without enabled memory cgroups
> > and memory cgroup pointer set in particular.
> >
> > This commit replaces PageKmemcg() and __SetPageKmemcg() with
> > PageMemcgKmem() and SetPageMemcgKmem(). __ClearPageKmemcg()
> > can be simple deleted because clear_page_mem_cgroup() already
> > does the job.
> >
> > As a bonus, on !CONFIG_MEMCG build the PageMemcgKmem() check will
> > be compiled out.
> >
> > Signed-off-by: Roman Gushchin <guro@xxxxxx>
>
> That sounds good to me!

Great!

>
> > ---
> >  include/linux/memcontrol.h | 58 ++++++++++++++++++++++++++++++++++++--
> >  include/linux/page-flags.h | 11 ++------
> >  mm/memcontrol.c            | 14 +++------
> >  mm/page_alloc.c            |  2 +-
> >  4 files changed, 62 insertions(+), 23 deletions(-)
> >
> > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > index 9a49f1e1c0c7..390db58500d5 100644
> > --- a/include/linux/memcontrol.h
> > +++ b/include/linux/memcontrol.h
> > @@ -346,8 +346,14 @@ extern struct mem_cgroup *root_mem_cgroup;
> >  enum page_memcg_flags {
> >  	/* page->memcg_data is a pointer to an objcgs vector */
> >  	PG_MEMCG_OBJ_CGROUPS,
> > +	/* page has been accounted as a non-slab kernel page */
> > +	PG_MEMCG_KMEM,
> > +	/* the next bit after the last actual flag */
> > +	PG_MEMCG_LAST_FLAG,
>
> *_NR_FLAGS would be customary.

Ok, __NR_PAGE_MEMCG_FLAGS ? Similar to __NR_PAGE_FLAGS.

>
> >  };
> >
> > +#define MEMCG_FLAGS_MASK ((1UL << PG_MEMCG_LAST_FLAG) - 1)
>
> Probably best to stick to the same prefix as the enum items.

You mean PG_MEMCG_FLAGS_MASK?

>
> > + * PageMemcgKmem - check if the page has MemcgKmem flag set
> > + * @page: a pointer to the page struct
> > + *
> > + * Checks if the page has MemcgKmem flag set. The caller must ensure that
> > + * the page has an associated memory cgroup. It's not safe to call this function
> > + * against some types of pages, e.g. slab pages.
> > + */
> > +static inline bool PageMemcgKmem(struct page *page)
> > +{
> > +	VM_BUG_ON_PAGE(test_bit(PG_MEMCG_OBJ_CGROUPS, &page->memcg_data), page);
> > +	return test_bit(PG_MEMCG_KMEM, &page->memcg_data);
> > +}
> > +
> > +/*
> > + * SetPageMemcgKmem - set the page's MemcgKmem flag
> > + * @page: a pointer to the page struct
> > + *
> > + * Set the page's MemcgKmem flag. The caller must ensure that the page has
> > + * an associated memory cgroup. It's not safe to call this function
> > + * against some types of pages, e.g. slab pages.
> > + */
> > +static inline void SetPageMemcgKmem(struct page *page)
> > +{
> > +	VM_BUG_ON_PAGE(!page->memcg_data, page);
> > +	VM_BUG_ON_PAGE(test_bit(PG_MEMCG_OBJ_CGROUPS, &page->memcg_data), page);
> > +	__set_bit(PG_MEMCG_KMEM, &page->memcg_data);
>
> It may be good to keep the __ prefix from __SetPageMemcg as long as
> this uses __set_bit, in case we later add atomic bit futzing.

Yeah, I agree. I thought about it. Maybe not so useful now, but more
future-proof.

Thanks!
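
To make the naming discussed above concrete, here is a rough userspace sketch of how the flag helpers might look. It is only an illustration under stated assumptions, not the next version of the patch: struct fake_page is a hypothetical stand-in for struct page, assert() stands in for VM_BUG_ON_PAGE(), plain bit operations stand in for test_bit()/__set_bit(), page_memcg_ptr() is a made-up helper, and __NR_PAGE_MEMCG_FLAGS, PG_MEMCG_FLAGS_MASK and __SetPageMemcgKmem() are just the names floated in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag bits kept in the low bits of the memcg_data word. */
enum page_memcg_flags {
	/* memcg_data is a pointer to an objcgs vector */
	PG_MEMCG_OBJ_CGROUPS,
	/* page has been accounted as a non-slab kernel page */
	PG_MEMCG_KMEM,
	/* number of flags; name suggested in this thread, not final */
	__NR_PAGE_MEMCG_FLAGS,
};

/* Mask covering all flag bits; name suggested in this thread, not final. */
#define PG_MEMCG_FLAGS_MASK ((1UL << __NR_PAGE_MEMCG_FLAGS) - 1)

/* Hypothetical stand-in for struct page: only the field used here. */
struct fake_page {
	unsigned long memcg_data;
};

/* Check the kmem flag; not valid for pages carrying an objcgs vector. */
static inline bool PageMemcgKmem(const struct fake_page *page)
{
	assert(!(page->memcg_data & (1UL << PG_MEMCG_OBJ_CGROUPS)));
	return page->memcg_data & (1UL << PG_MEMCG_KMEM);
}

/* Non-atomic set, hence the __ prefix (mirrors the use of __set_bit). */
static inline void __SetPageMemcgKmem(struct fake_page *page)
{
	assert(page->memcg_data);
	assert(!(page->memcg_data & (1UL << PG_MEMCG_OBJ_CGROUPS)));
	page->memcg_data |= 1UL << PG_MEMCG_KMEM;
}

/* Hypothetical helper: strip the flag bits to recover the pointer. */
static inline void *page_memcg_ptr(const struct fake_page *page)
{
	return (void *)(page->memcg_data & ~PG_MEMCG_FLAGS_MASK);
}
```

The sketch relies on the pointer stored in memcg_data being aligned to at least 1 << __NR_PAGE_MEMCG_FLAGS bytes, so that the low bits are free for flags and masking them off recovers the original pointer.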