On Mon 08-03-21 21:02:25, Matthew Wilcox wrote: > On Thu, Mar 04, 2021 at 07:40:53AM +0000, Zhou Guanghui wrote: > > As described in the split_page function comment, for the non-compound > > high order page, the sub-pages must be freed individually. If the > > memcg of the fisrt page is valid, the tail pages cannot be uncharged > > when be freed. > > > > For example, when alloc_pages_exact is used to allocate 1MB continuous > > physical memory, 2MB is charged(kmemcg is enabled and __GFP_ACCOUNT is > > set). When make_alloc_exact free the unused 1MB and free_pages_exact > > free the applied 1MB, actually, only 4KB(one page) is uncharged. > > > > Therefore, the memcg of the tail page needs to be set when split page. > > There's another place we need to do this to ... > > +++ b/mm/page_alloc.c > @@ -5081,9 +5081,15 @@ void __free_pages(struct page *page, unsigned int order) > { > if (put_page_testzero(page)) > free_the_page(page, order); > - else if (!PageHead(page)) > - while (order-- > 0) > - free_the_page(page + (1 << order), order); > + else if (!PageHead(page)) { > + while (order-- > 0) { > + struct page *tail = page + (1 << order); > +#ifdef CONFIG_MEMCG > + tail->memcg_data = page->memcg_data; > +#endif > + free_the_page(tail, order); > + } > + } > } > EXPORT_SYMBOL(__free_pages); Hmm, I was not aware of this code. This is really a tricky code. > I wonder if we shouldn't initialise memcg_data on all subsequent pages > of non-compound allocations instead? Because I'm not sure this is the > only place that needs to be fixed. That would be safer for sure. Do you mean this as a replacement to the original patch? diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 913c2b9e5c72..d44dea2b8d22 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -3135,8 +3135,21 @@ int __memcg_kmem_charge_page(struct page *page, gfp_t gfp, int order) if (memcg && !mem_cgroup_is_root(memcg)) { ret = __memcg_kmem_charge(memcg, gfp, 1 << order); if (!ret) { + int nr_pages = 1 << order; page->memcg_data = (unsigned long)memcg | MEMCG_DATA_KMEM; + + /* + * Compound pages are normally split or freed + * via their head pages so memcg_data in in the + * head page should be sufficient but there + * are exceptions to the rule (see __free_pages). + * Non compound pages would need to copy memcg anyway. + */ + for (i = 1; i < nr_pages; i++) { + struct page * p = page + i; + p->memcg_data = page->memcg_data + } return 0; } css_put(&memcg->css); -- Michal Hocko SUSE Labs