Re: [PATCH] bcachefs: Switch to memalloc_flags_do() for vmalloc allocations

Michal Hocko <mhocko@xxxxxxxx> · Tue, 3 Sep 2024 09:18:31 +0200

On Tue 03-09-24 14:34:05, Yafang Shao wrote:
> On Mon, Sep 2, 2024 at 5:09 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> >
> > On Mon 02-09-24 17:01:12, Yafang Shao wrote:
> > > > I really do not see why GFP_NOFAIL should be any special in this
> > > > specific case.
> > >
> > > I believe there's no way to stop it from looping, even if you
> > > implement a sophisticated user space OOM killer. ;)
> >
> > User space OOM killer should be helping to replenish a free memory and
> > we have some heuristics to help NOFAIL users out with some portion of
> > memory reserves already IIRC. So we do already give them some special
> > treatment in the page allocator path. Not so much in the reclaim path.
> 
> When setting GFP_NOFAIL, it's important to not only enable direct
> reclaim but also the OOM killer. In scenarios where swap is off and
> there is minimal page cache, setting GFP_NOFAIL without __GFP_FS can
> result in an infinite loop. In other words, GFP_NOFAIL should not be
> used with GFP_NOFS. Unfortunately, many call sites do combine them.

This is already the case with GFP_NOFS on its own. NOFAIL is no
different and both will loop forever. We rely heavily on kswapd
or on direct reclaim from other GFP_KERNEL allocations for forward progress.

Unfortunately we haven't really found a better way to deal with
NOFS-only or NOFS-predominant workloads.
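
To illustrate the forward-progress dependency, here is a toy user-space
model. It is not kernel code and not how the page allocator is written.
Every name in it (toy_mem, toy_direct_reclaim, toy_alloc, max_retries) is
hypothetical, and the retry cap only stands in for "would loop forever".
It assumes all reclaimable memory is fs-backed, with swap off.

```c
#include <stdbool.h>

/* Toy model only: hypothetical names, not the kernel's data structures. */
struct toy_mem {
	int free_pages;     /* pages that can be handed out right now */
	int fs_cache_pages; /* pages reclaimable only by entering the fs */
};

/*
 * Model of direct reclaim. A NOFS-style caller must not recurse into the
 * filesystem, so in this model it cannot free fs-backed pages at all.
 * Returns the number of pages freed (0 or 1).
 */
static int toy_direct_reclaim(struct toy_mem *m, bool may_enter_fs)
{
	if (!may_enter_fs || m->fs_cache_pages == 0)
		return 0;
	m->fs_cache_pages--;
	m->free_pages++;
	return 1;
}

/*
 * Model of a looping allocation. Returns the number of retries that were
 * needed, or -1 when max_retries is exhausted, which stands in for an
 * allocation that would keep looping.
 *
 * helper_runs models kswapd or some other GFP_KERNEL allocation doing
 * reclaim on our behalf between retries.
 */
static int toy_alloc(struct toy_mem *m, bool may_enter_fs, bool helper_runs,
		     int max_retries)
{
	for (int tries = 0; tries < max_retries; tries++) {
		if (m->free_pages > 0) {
			m->free_pages--;
			return tries;
		}
		/* Our own reclaim attempt, limited by our context. */
		toy_direct_reclaim(m, may_enter_fs);
		/* Reclaim done by somebody who is allowed to enter the fs. */
		if (helper_runs)
			toy_direct_reclaim(m, true);
	}
	return -1;
}
```

In this model a NOFS-style caller with no helper never gets a page, while
the same caller succeeds after one retry once a helper reclaims for it.
Whether the caller is also NOFAIL makes no difference to the loop, which
is the point made above.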

-- 
Michal Hocko
SUSE Labs