Re: [PATCH 0/2] mm: skip memcg for certain address space

Michal Hocko <mhocko@xxxxxxxx> · Wed, 17 Jul 2024 18:14:16 +0200

On Wed 17-07-24 17:55:23, Vlastimil Babka (SUSE) wrote:
> Hi,
> 
> you should have Cc'd people according to the get_maintainer.pl script to
> get a reply faster. Let me Cc the MEMCG section.
> 
> On 7/10/24 3:07 AM, Qu Wenruo wrote:
> > Recently I'm hitting soft lockup if adding an order 2 folio to a
> > filemap using GFP_NOFS | __GFP_NOFAIL. The softlockup happens at memcg
> > charge code, and I guess that's exactly what __GFP_NOFAIL is expected to
> > do, wait indefinitely until the request can be met.
> 
> Seems like a bug to me, as the charging of __GFP_NOFAIL in
> try_charge_memcg() should proceed to the force: part AFAICS and just go over
> the limit.
> 
> I was suspecting mem_cgroup_oom() a bit earlier return true, causing the
> retry loop, due to GFP_NOFS. But it seems out_of_memory() should be
> specifically proceeding for GFP_NOFS if it's memcg oom. But I might be
> missing something else. Anyway we should know what exactly is going on first.

Correct. The memcg OOM code will invoke the memcg OOM killer for NOFS
requests. See out_of_memory():

        /*
         * The OOM killer does not compensate for IO-less reclaim.
         * But mem_cgroup_oom() has to invoke the OOM killer even
         * if it is a GFP_NOFS allocation.
         */
        if (!(oc->gfp_mask & __GFP_FS) && !is_memcg_oom(oc))
                return true;

That means that there will be a victim killed, charges reclaimed and
forward progress made. If there is no victim then the charging path will
bail out and overcharge.
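
To make the effect of that check concrete, here is a rough user-space
sketch of the same condition. It is only a model: MODEL_GFP_FS,
struct model_oom_control and model_oom_skipped() are made-up names for
illustration, not kernel symbols, and the flag value is arbitrary.

```c
#include <stdbool.h>

/* Hypothetical flag value for illustration; not the kernel's __GFP_FS bit. */
#define MODEL_GFP_FS 0x1u

/* Simplified stand-in for the two pieces of oom_control state the check uses. */
struct model_oom_control {
	unsigned int gfp_mask;	/* allocation flags of the failing request */
	bool memcg_oom;		/* true when a memcg limit triggered the OOM */
};

/*
 * Mirrors the quoted condition: returns true when the OOM killer is
 * skipped, i.e. for an allocation without the FS flag that is not a
 * memcg OOM.
 */
static bool model_oom_skipped(const struct model_oom_control *oc)
{
	return !(oc->gfp_mask & MODEL_GFP_FS) && !oc->memcg_oom;
}
```

In this model a NOFS request is skipped only in the global case; with
memcg_oom set, the function returns false and the killer would be invoked,
which is the behaviour described above.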

Also, reclaim should have cond_resched() calls in the reclaim path. If that
is not sufficient, it should be fixed rather than worked around.
-- 
Michal Hocko
SUSE Labs