On Wed, 15 Jul 2020, Chris Down wrote:

> Hi David,
>
> I'm somewhat against adding more metrics which try to approximate availability
> of memory when we already know it not to generally manifest very well in
> practice, especially since this *is* calculable by userspace (albeit with some
> knowledge of mm internals). Users and applications often vastly overestimate
> the reliability of these metrics, especially since they heavily depend on
> transient page states and whatever reclaim efficacy happens to be achieved at
> the time there is demand.
>

Hi Chris,

With the proposed anon_reclaimable, do you have any reliability concerns? This
would be the amount of lazy freeable memory plus the memory that can be
uncharged if compound pages on the deferred split queue are split under memory
pressure. It seems to be a very precise value (as slab_reclaimable, already in
memory.stat, is), so I'm not sure why there is a reliability concern. Maybe
you can elaborate?

Today, this information is indeed possible to calculate from userspace. The
idea is to present it in a way that remains backwards compatible as the kernel
implementation changes. When lazy freeable memory was added, for instance,
userspace likely would not have preemptively been doing an "active_file +
inactive_file - file" calculation to factor that in as reclaimable anon :)

> What do you intend to do with these metrics and how do you envisage other
> users should use them? Is it not possible to rework the strategy to use
> pressure information and/or workingset pressurisation instead?
>

Previously, users would interpret their anon usage as non-reclaimable if swap
is disabled, and now that value can include a *lot* of easily reclaimable
memory. Our users also carefully monitor their current memcg usage and/or anon
usage to detect abnormalities, without concern for what is reclaimable,
especially for things like deferred split queues, which were purely a kernel
implementation change. Memcg usage and anon usage then become wildly different
between kernel versions and our users alert on that abnormality.

The example I gave earlier in the thread showed how dramatically different
memory.current is before and after the introduction of deferred split queues.
Userspace sees ballooning memcg usage and alerts on it (suspecting a memory
leak, for example) when in reality this is purely memory that is reclaimable
under pressure, and the growth is the result of a kernel implementation
detail.

We plan on factoring this information in when determining the actual amount of
memory that can and cannot be reclaimed from a memcg hierarchy at any given
time.
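
To make the userspace arithmetic above concrete, something along these lines
is roughly what a monitoring tool would have to do today against cgroup v2
memory.stat (values are in bytes). This is only an illustrative sketch, not
part of the patch; the cgroup path and helper names are made up, and it cannot
see deferred-split memory at all, which is part of the motivation for
anon_reclaimable:

#!/usr/bin/env python3
# Sketch: estimate lazy-free (MADV_FREE) anon from existing memory.stat
# fields.  Lazy-free pages are charged as anon but sit on the file LRU,
# so "(active_file + inactive_file) - file" approximates them.

def read_memory_stat(path="/sys/fs/cgroup/workload/memory.stat"):
    # memory.stat is "key value" per line, values in bytes.
    stats = {}
    with open(path) as f:
        for line in f:
            key, value = line.split()
            stats[key] = int(value)
    return stats

def estimated_lazyfree_anon(stats):
    # Pages on the file LRU that are not actually file-backed.
    return (stats["active_file"] + stats["inactive_file"]) - stats["file"]

if __name__ == "__main__":
    stats = read_memory_stat()
    print("approx. reclaimable (lazy-free) anon:",
          estimated_lazyfree_anon(stats), "bytes")
    # Note: memory held on the deferred split queue is invisible here,
    # so this underestimates what reclaim could actually free.

The point is that this calculation only works as long as the kernel keeps
charging lazy-free pages this particular way, which is exactly the kind of
implementation detail anon_reclaimable would hide from userspace.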