Re: [PATCH] mm, vmscan: guarantee drop_slab_node() termination

Michal Hocko <mhocko@xxxxxxxx> · Thu, 19 Aug 2021 16:16:03 +0200

On Thu 19-08-21 14:21:08, Chris Down wrote:
> Vlastimil Babka writes:
> > On 8/19/21 4:55 AM, Kefeng Wang wrote:
> > > 
> > > On 2021/8/19 5:48, Chris Down wrote:
> > > > Vlastimil Babka writes:
> > > > 
> > > > I think this is a good idea, thanks for bringing it up :-)
> > > > 
> > > > I'm not sure about the bitshift idea, though. It certainly makes sure
> > > > that even large, continuous periods of reclaim eventually terminates,
> > > > but I find it hard to reason about -- for example, if there's a lot of
> > > > parallel activity, that might result in 10 constantly reintroduced
> > > > pages, or 1000 pages, and it's not immediately obvious that we should
> > > > treat those differently.
> > > > 
> > > > What about using MAX_RECLAIM_RETRIES? There's already precedent for
> > > > using it in non-OOM scenarios, like mem_cgroup_handle_over_high.
> > 
> > It's an option, but then (together with fixed threshold) it ignores how
> > large the 'freed' value is, as long it's above threshold? Although the
> > end result will probably not be much different.
> 
> Yeah, but we already draw the line at 10 right now. `freed > 10 && retries <
> MAX_RECLAIM_RETRIES` seems easier to reason about, at least to me, and stays
> closer to the current behaviour while providing a definitive point of loop
> termination.

I have to say that I am not really a fan of MAX_RECLAIM_RETRIES approach
especially for user interfaces. Any limit on retries has kicked us back
(e.g. offlining for the memory hotplug just to mention one of those).
drop_caches can take a long time on its own even without retrying. We
should teach people to interrupt those operations if they should really
finish early (e.g. timeout $TIMEOUT echo > /proc/sys/vm/drop_caches)
rather than trying to be extra clever here.

I am not against the patch Vlastimil is proposing because it replaces an
ad-hoc limit on the reclaimed objects threshold with something that is
less "random" - sort of a backoff instead seems like an improvement to
me. But I would still be worried that this could regress for some users
so in an ideal world the existing bail on signal should be enough.
-- 
Michal Hocko
SUSE Labs