Re: [PATCH] bcachefs: Switch to memalloc_flags_do() for vmalloc allocations

Yafang Shao <laoar.shao@xxxxxxxxx> · Mon, 2 Sep 2024 17:01:12 +0800

On Mon, Sep 2, 2024 at 4:11 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Mon 02-09-24 11:02:50, Yafang Shao wrote:
> > On Sun, Sep 1, 2024 at 11:35 AM Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> [...]
> > > AIUI, the memory allocation looping has back-offs already built in
> > > to it when memory reserves are exhausted and/or reclaim is
> > > congested.
> > >
> > > e.g:
> > >
> > > get_page_from_freelist()
> > >   (zone below watermark)
> > >   node_reclaim()
> > >     __node_reclaim()
> > >       shrink_node()
> > >         reclaim_throttle()
> >
> > It applies to all kinds of allocations.
> >
> > >
> > > And the call to reclaim_throttle() will do the equivalent of
> > > memalloc_retry_wait() (a 2ms sleep).
> >
> > I'm wondering if we should take special action for __GFP_NOFAIL, as
> > currently, it only results in an endless loop with no intervention.
>
> If the memory allocator/reclaim is thrashing on a couple of remaining
> pages that are easy to drop and reallocate again then the same endless
> loop is de facto the behavior for _all_ non-costly allocations. All of them
> will loop. This is not really great but so far we haven't really
> developed a reliable thrashing detection that would suit all potential
> workloads. There are some that simply benefit from work not being lost
> even if the cost is a severe performance penalty. A general conclusion
> has been that workloads which would rather see OOM killer triggering
> early should implement that policy in the userspace. We have PSI,
> refault counters and other tools that could be used to detect
> pathological patterns and trigger workload specific action.

Indeed, we're currently working on developing that policy.
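
For illustration only, here is a minimal sketch of the kind of check such
a userspace policy could start from. It is not our actual implementation.
It assumes the documented /proc/pressure/memory line format
("some|full avg10=... avg60=... avg300=... total=..."); the helper names
(psi_parse_line, psi_should_act, psi_check_memory) and the threshold are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* One parsed line of /proc/pressure/memory ("some ..." or "full ..."). */
struct psi_line {
	char kind[8];              /* "some" or "full" */
	double avg10, avg60, avg300; /* stall percentages over 10s/60s/300s */
	unsigned long long total;  /* cumulative stall time in microseconds */
};

/* Hypothetical helper: parse one PSI line, return true on success. */
bool psi_parse_line(const char *line, struct psi_line *out)
{
	assert(line != NULL && out != NULL);
	return sscanf(line, "%7s avg10=%lf avg60=%lf avg300=%lf total=%llu",
		      out->kind, &out->avg10, &out->avg60, &out->avg300,
		      &out->total) == 5;
}

/*
 * Hypothetical policy: fire when the "full" avg10 value exceeds the
 * given threshold (in percent). A real policy would likely also look
 * at refault counters and be workload specific.
 */
bool psi_should_act(const struct psi_line *p, double threshold_pct)
{
	return strcmp(p->kind, "full") == 0 && p->avg10 > threshold_pct;
}

/*
 * Read /proc/pressure/memory once.
 * Returns 1 if the policy would fire, 0 if not, -1 if PSI is unavailable.
 */
int psi_check_memory(double threshold_pct)
{
	FILE *f = fopen("/proc/pressure/memory", "r");
	char buf[256];
	struct psi_line p;
	int fire = 0;

	if (!f)
		return -1;
	while (fgets(buf, sizeof(buf), f))
		if (psi_parse_line(buf, &p) && psi_should_act(&p, threshold_pct))
			fire = 1;
	fclose(f);
	return fire;
}
```

A daemon would call psi_check_memory() periodically (or use PSI
triggers via poll()) and take a workload specific action when it
returns 1.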

>
> I really do not see why GFP_NOFAIL should be special in any way in
> this specific case.

I believe there's no way to stop a __GFP_NOFAIL allocation from looping,
even if you implement a sophisticated user space OOM killer. ;)

--
Regards
Yafang