On 2/20/20 8:19 AM, Michal Hocko wrote:
>>
>> Michal, could you remind what the deal with soft limit? Why is it dead?
>
> because of the very disruptive semantic. Essentially the way how it was
> grafted into the normal reclaim. It is essentially a priority 0 reclaim
> round to shrink a hierarchy which is the most in excess before we do a
> normal reclaim. This can lead to an over reclaim, long stalls etc.

Thanks for the explanation.

I wonder if a few factors could mitigate the stalls in the tiered
memory context:

1. Demoting pages from top tier memory to second tier memory is much
   faster than reclaiming them and swapping them out.

2. Demotion targets pages that are colder and less active.

3. If we engage page demotion mostly in the background, say via kswapd,
   and not in the direct reclaim path, we can avoid long stalls during
   page allocation. If the memory pressure is severe on the top tier
   memory, perhaps the memory could be allocated from the second tier
   memory node to avoid stalling.

The stalls could still prove to be problematic. We're implementing
prototypes and we'll have a better idea of workload latencies once we
can collect data.

Tim