On 2024/1/16 23:07, Ryan Roberts wrote:
On 16/01/2024 14:51, Matthew Wilcox wrote:
On Tue, Jan 16, 2024 at 02:35:54PM +0000, Ryan Roberts wrote:
On 16/01/2024 07:13, Kefeng Wang wrote:
In order to allocate large folios as often as possible, move
the mem charge into alloc_anon_folio() and try the next order
if mem_cgroup_charge() fails. Also change GFP_KERNEL
to gfp to be consistent with PMD THP.
I agree that changing gfp gives you consistency. But it's not entirely clear to
me why THP should use one set of flags for this case, and single pages another.
Why does this difference exist?
I think it needs to be spelled out much better in the changelog. Here's
my attempt at explaining why we might want this change.
mem_cgroup_charge() uses the GFP flags in a fairly sophisticated way.
In addition to checking gfpflags_allow_blocking(), it pays attention to
__GFP_NORETRY and __GFP_RETRY_MAYFAIL to ensure that processes within
this memcg do not exceed their quotas. Using the same GFP flags ensures
that we handle large anonymous folios correctly, including falling back
to smaller orders when there is plenty of memory available in the system
but this memcg is close to its limits.
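The fallback described above can be illustrated with a small userspace sketch. This is not the kernel code: struct toy_memcg, toy_charge() and toy_alloc_anon_folio() are hypothetical names invented for illustration, and the toy charge simply fails when the limit would be exceeded. It does not model reclaim or GFP flags, which is where the real mem_cgroup_charge() differs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg: a page limit and the current usage. */
struct toy_memcg {
	unsigned long limit_pages;
	unsigned long usage_pages;
};

/*
 * Toy charge: succeeds only if a folio of the given order fits under the
 * limit. The real mem_cgroup_charge() may reclaim first, depending on gfp;
 * none of that is modelled here.
 */
static bool toy_charge(struct toy_memcg *cg, unsigned int order)
{
	unsigned long nr;

	assert(order < 8 * sizeof(unsigned long));
	nr = 1UL << order;

	if (cg->usage_pages + nr > cg->limit_pages)
		return false;

	cg->usage_pages += nr;
	return true;
}

/*
 * Walk the candidate orders from largest to smallest and return the first
 * order whose charge succeeds, or -1 if even the smallest one fails.
 */
static int toy_alloc_anon_folio(struct toy_memcg *cg,
				const unsigned int *orders, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (toy_charge(cg, orders[i]))
			return (int)orders[i];
	}
	return -1;
}
```

With a limit of 20 pages and candidate orders {4, 2, 0}, the first call charges an order-4 folio (16 pages), the second falls back to order 2 because another 16 pages would exceed the limit, and a third call fails at every order.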
Thanks for the explanation. Please add to the commit log.
Thanks, that is much better, I will update the changelog. A similar change
was made for THP, see commit 3b3636924dfe ("mm, memcg: sync allocation and
memcg charge gfp flags for THP").
Essentially you are saying that previously, all mTHP allocations would cause
reclaim from the memcg if the allocation caused the quota to be used up. But
with this change, it might now avoid that reclaim and just OOM, if the flags are
as such? So then we retry with the next lowest available size. Makes sense!
With the correct GFP flags, we could get less reclaim and a faster fallback
to the next order, which is what I want too.
... I remain not-an-expert in memcg and anonymous memory and welcome
improvements to that text.
Me too...