The patch titled Subject: mm: memory: move mem_cgroup_charge() into alloc_anon_folio() has been added to the -mm mm-unstable branch. Its filename is mm-memory-move-mem_cgroup_charge-into-alloc_anon_folio.patch This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-memory-move-mem_cgroup_charge-into-alloc_anon_folio.patch This patch will later appear in the mm-unstable branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days ------------------------------------------------------ From: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx> Subject: mm: memory: move mem_cgroup_charge() into alloc_anon_folio() Date: Wed, 17 Jan 2024 18:39:54 +0800 The GFP flags from vma_thp_gfp_mask() according to user configuration only used for large folio allocation but not for memory cgroup charge, and GFP_KERNEL is used for both order-0 and large order folio when memory cgroup charge at present. However, mem_cgroup_charge() uses the GFP flags in a fairly sophisticated way. In addition to checking gfpflags_allow_blocking(), it pays attention to __GFP_NORETRY and __GFP_RETRY_MAYFAIL to ensure that processes within this memcg do not exceed their quotas. So we'd better to move mem_cgroup_charge() into alloc_anon_folio(), 1) it will make us to allocate as much as possible large order folio, because we could try the next order if mem_cgroup_charge() fails, although the memcg's memory usage is close to its limits. 2) using same GFP flags for allocation and charge is to be consistent with PMD THP firstly, in addition, according to GFP flag returned from vma_thp_gfp_mask(), GFP_TRANSHUGE_LIGHT could make us skip direct reclaim, _GFP_NORETRY will make us skip mem_cgroup_oom() and won't trigger memory cgroup oom from large order(order <= COSTLY_ORDER) folio charging. Link: https://lkml.kernel.org/r/20240122011612.501029-1-wangkefeng.wang@xxxxxxxxxx Link: https://lkml.kernel.org/r/20240117103954.2756050-1-wangkefeng.wang@xxxxxxxxxx Signed-off-by: Kefeng Wang <wangkefeng.wang@xxxxxxxxxx> Reviewed-by: Ryan Roberts <ryan.roberts@xxxxxxx> Cc: David Hildenbrand <david@xxxxxxxxxx> Cc: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx> Cc: Michal Hocko <mhocko@xxxxxxxx> Cc: Roman Gushchin <roman.gushchin@xxxxxxxxx> Cc: Johannes Weiner <hannes@xxxxxxxxxxx> Cc: Shakeel Butt <shakeelb@xxxxxxxxxx> Cc: Muchun Song <songmuchun@xxxxxxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- mm/memory.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) --- a/mm/memory.c~mm-memory-move-mem_cgroup_charge-into-alloc_anon_folio +++ a/mm/memory.c @@ -4153,8 +4153,8 @@ static bool pte_range_none(pte_t *pte, i static struct folio *alloc_anon_folio(struct vm_fault *vmf) { -#ifdef CONFIG_TRANSPARENT_HUGEPAGE struct vm_area_struct *vma = vmf->vma; +#ifdef CONFIG_TRANSPARENT_HUGEPAGE unsigned long orders; struct folio *folio; unsigned long addr; @@ -4206,15 +4206,21 @@ static struct folio *alloc_anon_folio(st addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << order); folio = vma_alloc_folio(gfp, order, vma, addr, true); if (folio) { + if (mem_cgroup_charge(folio, vma->vm_mm, gfp)) { + folio_put(folio); + goto next; + } + folio_throttle_swaprate(folio, gfp); clear_huge_page(&folio->page, vmf->address, 1 << order); return folio; } +next: order = next_order(&orders, order); } fallback: #endif - return vma_alloc_zeroed_movable_folio(vmf->vma, vmf->address); + return folio_prealloc(vma->vm_mm, vma, vmf->address, true); } /* @@ -4281,10 +4287,6 @@ static vm_fault_t do_anonymous_page(stru nr_pages = folio_nr_pages(folio); addr = ALIGN_DOWN(vmf->address, nr_pages * PAGE_SIZE); - if (mem_cgroup_charge(folio, vma->vm_mm, GFP_KERNEL)) - goto oom_free_page; - folio_throttle_swaprate(folio, GFP_KERNEL); - /* * The memory barrier inside __folio_mark_uptodate makes sure that * preceding stores to the page contents become visible before @@ -4338,8 +4340,6 @@ unlock: release: folio_put(folio); goto unlock; -oom_free_page: - folio_put(folio); oom: return VM_FAULT_OOM; } _ Patches currently in -mm which might be from wangkefeng.wang@xxxxxxxxxx are mm-memory-use-nth_page-in-clear-copy_subpage.patch s390-use-pfn_swap_entry_folio-in-ptep_zap_swap_entry.patch mm-use-pfn_swap_entry_folio-in-__split_huge_pmd_locked.patch mm-use-pfn_swap_entry_to_folio-in-zap_huge_pmd.patch mm-use-pfn_swap_entry_folio-in-copy_nonpresent_pte.patch mm-convert-to-should_zap_page-to-should_zap_folio.patch mm-convert-to-should_zap_page-to-should_zap_folio-fix.patch mm-convert-mm_counter-to-take-a-folio.patch mm-convert-mm_counter_file-to-take-a-folio.patch mm-memory-move-mem_cgroup_charge-into-alloc_anon_folio.patch