On 16/01/2024 14:51, Matthew Wilcox wrote:
> On Tue, Jan 16, 2024 at 02:35:54PM +0000, Ryan Roberts wrote:
>> On 16/01/2024 07:13, Kefeng Wang wrote:
>>> In order to allocate as much as possible of large folio, move
>>> the mem charge into alloc_anon_folio() and try the next order
>>> if mem_cgroup_charge() fails, also we change the GFP_KERNEL
>>> to gfp to be consistent with PMD THP.
>>
>> I agree that changing gfp gives you consistency. But it's not entirely clear to
>> me why THP should use one set of flags for this case, and since pages another.
>> Why does this difference exist?
>
> I think it needs to be spelled out much better in the changelog. Here's
> my attempt at explaining why we might want this change.
>
> mem_cgroup_charge() uses the GFP flags in a fairly sophisticated way.
> In addition to checking gfpflags_allow_blocking(), it pays attention to
> __GFP_NORETRY and __GFP_RETRY_MAYFAIL to ensure that processes within
> this memcg do not exceed their quotas. Using the same GFP flags ensures
> that we handle large anonymous folios correctly, including falling back
> to smaller orders when there is plenty of memory available in the system
> but this memcg is close to its limits.

Thanks for the explanation. Please add to the commit log.

Essentially you are saying that previously, all mTHP allocations would cause
reclaim from the memcg if the allocation caused the quota to be used up. But
with this change, it might now avoid that reclaim and just OOM, if the flags are
as such? So then we retry with the next lowest available size. Makes sense!

>
> ... I remain not-an-expert in memcg and anonymous memory and welcome
> improvements to that text.

Me too...
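
For illustration only, the "try the next order if the charge fails" behaviour
discussed above can be modelled in plain userspace C. This is a sketch under
stated assumptions, not the patch itself: struct fake_memcg, fake_charge() and
alloc_with_fallback() are hypothetical names, and the model ignores everything
the real mem_cgroup_charge() does with GFP flags (reclaim, retries, OOM).

```c
/* Userspace model only, NOT kernel code. All names here are hypothetical. */
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a memcg: current usage and hard limit, both in pages. */
struct fake_memcg {
	unsigned long usage;
	unsigned long limit;
};

/*
 * Stand-in for the charge step: succeeds only if the request fits under
 * the limit. Reclaim and GFP-dependent retry behaviour are not modelled.
 */
static bool fake_charge(struct fake_memcg *cg, unsigned long nr_pages)
{
	if (cg->usage + nr_pages > cg->limit)
		return false;
	cg->usage += nr_pages;
	return true;
}

/*
 * Walk the candidate orders from largest to smallest and return the first
 * order whose charge succeeds, or -1 if none of them fits.
 */
static int alloc_with_fallback(struct fake_memcg *cg,
			       const unsigned int *orders, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		unsigned long nr_pages = 1UL << orders[i];

		if (fake_charge(cg, nr_pages))
			return (int)orders[i];
		/* Charge failed: fall through to the next smaller order. */
	}
	return -1;
}
```

With a limit of 1024 pages and 1000 already charged, an order-9 request (512
pages) does not fit, so the loop falls back to the next order that does,
which is the "plenty of memory in the system but this memcg is close to its
limits" case described in the quoted text.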