Re: [Lsf-pc] [LSF/MM TOPIC] Memory cgroups, whether you like it or not

Michal Hocko <mhocko@xxxxxxxxxx> · Fri, 21 Feb 2020 09:42:41 +0100

On Thu 20-02-20 14:16:02, Tim Chen wrote:
> On 2/20/20 8:19 AM, Michal Hocko wrote:
> 
> >>
> >> Michal, could you remind me what the deal is with soft limit? Why is it dead?
> > 
> > because of the very disruptive semantic, essentially the way it was
> > grafted into the normal reclaim. It is essentially a priority 0 reclaim
> > round to shrink the hierarchy which is the most in excess before we do a
> > normal reclaim. This can lead to over-reclaim, long stalls etc.
> 
> Thanks for the explanation.  I wonder if a few factors could mitigate the
> stalls in the tiered memory context:
> 
> 1. The speed of demotion between top tier memory and second tier memory
> is much faster than reclaiming the pages and swapping them out.

You could have accumulated a lot of soft limit excess before it is
reclaimed. So I do not think the speed of the demotion is the primary
factor.
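
To illustrate the point about accumulated excess, here is a toy model
(not kernel code; every name in it is hypothetical, and it assumes the
aggressive pass takes back the whole excess, which is a simplification).
It picks the group that is most over its soft limit and shrinks it
before any normal reclaim would run:

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical, not kernel identifiers. */
struct group {
	unsigned long usage;      /* pages currently charged */
	unsigned long soft_limit; /* configured soft limit, in pages */
};

/* Pages charged above the soft limit, or 0 if the group is within it. */
static unsigned long soft_excess(const struct group *g)
{
	return g->usage > g->soft_limit ? g->usage - g->soft_limit : 0;
}

/* Index of the group with the largest excess (first one wins on ties). */
static size_t pick_victim(const struct group *groups, size_t n)
{
	size_t i, victim = 0;

	assert(n > 0);
	for (i = 1; i < n; i++)
		if (soft_excess(&groups[i]) > soft_excess(&groups[victim]))
			victim = i;
	return victim;
}

/*
 * Model of one soft limit pass: push the worst offender all the way back
 * to its soft limit, regardless of how much memory the caller needs.
 * Returns the number of pages reclaimed.
 */
static unsigned long soft_limit_pass(struct group *groups, size_t n)
{
	struct group *g = &groups[pick_victim(groups, n)];
	unsigned long reclaimed = soft_excess(g);

	g->usage -= reclaimed;
	return reclaimed;
}
```

In this model the amount of work in a single pass scales with the
accumulated excess rather than with the size of the allocation that
triggered it, which is why a faster per-page operation does not remove
the problem.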

> 2. Demotion targets pages that are colder and less active.
> 
> 3. If we engage the page demotion mostly in the background, say via kswapd,
> and not in the direct reclaim path, we can avoid long stalls
> during page allocation.  If the memory pressure is severe
> on the top tier memory, perhaps the memory could be allocated from the second
> tier memory node to avoid stalling.
> 
> The stalls could still prove to be problematic.  We're implementing
> prototypes and we'll have a better idea of workload latencies once we
> can collect data.

I would really encourage you to not hook into the soft limit reclaim
even if you somehow manage to reduce the problem with stalls, for at
least three reasons:
1) soft limit is not going to be added to cgroup v2 because there is a
   different API to achieve a pro-active reclaim
2) soft limit is not aware of the memory you are reclaiming so using it
   for tiered memory sounds like a bad fit to me.
3) changing the semantic of the existing interface is always
   troublesome. Please have a look at the mailing list archives from the
   last time we attempted that, e.g.
   http://lkml.kernel.org/r/1371557387-22434-1-git-send-email-mhocko@xxxxxxx

Anyway, it is hard to comment further without knowing the details of how
you actually want to use the soft limit for different memory types and
their balancing.
-- 
Michal Hocko
SUSE Labs