On Tue 20-03-18 13:25:23, David Rientjes wrote: > On Tue, 20 Mar 2018, Michal Hocko wrote: > > > > Commit 2516035499b9 ("mm, thp: remove __GFP_NORETRY from khugepaged and > > > madvised allocations") changed the page allocator to no longer detect thp > > > allocations based on __GFP_NORETRY. > > > > > > It did not, however, modify the mem cgroup try_charge() path to avoid oom > > > kill for either khugepaged collapsing or thp faulting. It is never > > > expected to oom kill a process to allocate a hugepage for thp; reclaim is > > > governed by the thp defrag mode and MADV_HUGEPAGE, but allocations (and > > > charging) should fallback instead of oom killing processes. > > > > For some reason I thought that the charging path simply bails out for > > costly orders - effectively the same thing as for the global OOM killer. > > But we do not. Is there any reason to not do that though? Why don't we > > simply do > > > > I'm not sure of the expectation of high-order memcg charging without > __GFP_NORETRY, It should be semantically compatible with the allocation path. > I only know that khugepaged can now cause memcg oom kills > when trying to collapse memory, and then subsequently found that the same > situation exists for faulting instead of falling back to small pages. And that is clearly a bug because page allocator doesn't oom kill while the memcg charge does for the same gfp flag. That should be fixed. > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > > index d1a917b5b7b7..08accbcd1a18 100644 > > --- a/mm/memcontrol.c > > +++ b/mm/memcontrol.c > > @@ -1493,7 +1493,7 @@ static void memcg_oom_recover(struct mem_cgroup *memcg) > > > > static void mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int order) > > { > > - if (!current->memcg_may_oom) > > + if (!current->memcg_may_oom || order > PAGE_ALLOC_COSTLY_ORDER) > > return; > > /* > > * We are in the middle of the charge context here, so we > > That may make sense as an additional patch, but for thp allocations we > don't want to retry reclaim nr_retries times anyway; we want the old > behavior of __GFP_NORETRY before commit 2516035499b9. Why? Allocation and the charge path should use the same gfp mask unless there is a strong reason for it. If you have one then please mention it in the changelog. > So the above would be a follow-up patch that wouldn't replace mine. Unless there is a strong reason to use different gfp mask for the allocation and the charge then your fix is actually wrong. -- Michal Hocko SUSE Labs