On Thu, 16 Jul 2020, Shakeel Butt wrote:

> > Userspace can lack insight into the amount of memory that can be reclaimed
> > from a memcg based on values from memory.stat.  Two specific examples:
> >
> >  - Lazy freeable memory (MADV_FREE) that are clean anonymous pages on the
> >    inactive file LRU that can be quickly reclaimed under memory pressure
> >    but otherwise shows up as mapped anon in memory.stat, and
> >
> >  - Memory on deferred split queues (thp) that are compound pages that can
> >    be split and uncharged from the memcg under memory pressure, but
> >    otherwise shows up as charged anon LRU memory in memory.stat.
> >
> > Both of this anonymous usage is also charged to memory.current.
> >
> > Userspace can currently derive this information but it depends on kernel
> > implementation details for how this memory is handled for the purposes of
> > reclaim (anon on inactive file LRU or unmapped anon on the LRU).
> >
> > For the purposes of writing portable userspace code that does not need to
> > have insight into the kernel implementation for reclaimable memory, this
> > exports a stat that reveals the amount of anonymous memory that can be
> > reclaimed and uncharged from the memcg to start new applications.
> >
> > As the kernel implementation evolves for memory that can be reclaimed
> > under memory pressure, this stat can be kept consistent.
> >
> > Signed-off-by: David Rientjes <rientjes@xxxxxxxxxx>
> > ---
> >  Documentation/admin-guide/cgroup-v2.rst |  6 +++++
> >  mm/memcontrol.c                         | 31 +++++++++++++++++++++++++
> >  2 files changed, 37 insertions(+)
> >
> > diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> > --- a/Documentation/admin-guide/cgroup-v2.rst
> > +++ b/Documentation/admin-guide/cgroup-v2.rst
> > @@ -1296,6 +1296,12 @@ PAGE_SIZE multiple when read back.
> >               Amount of memory used in anonymous mappings backed by
> >               transparent hugepages
> >
> > +         anon_reclaimable
> > +               The amount of charged anonymous memory that can be reclaimed
> > +               under memory pressure without swap.  This currently includes
> > +               lazy freeable memory (MADV_FREE) and compound pages that can be
> > +               split and uncharged.
> > +
> >           inactive_anon, active_anon, inactive_file, active_file, unevictable
> >               Amount of memory, swap-backed and filesystem-backed,
> >               on the internal memory management lists used by the
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -1350,6 +1350,32 @@ static bool mem_cgroup_wait_acct_move(struct mem_cgroup *memcg)
> >         return false;
> >  }
> >
> > +/*
> > + * Returns the amount of anon memory that is charged to the memcg that is
> > + * reclaimable under memory pressure without swap, in pages.
> > + */
> > +static unsigned long memcg_anon_reclaimable(struct mem_cgroup *memcg)
> > +{
> > +       long deferred, lazyfree;
> > +
> > +       /*
> > +        * Deferred pages are charged anonymous pages that are on the LRU but
> > +        * are unmapped.  These compound pages are split under memory pressure.
> > +        */
> > +       deferred = max_t(long, memcg_page_state(memcg, NR_ACTIVE_ANON) +
> > +                              memcg_page_state(memcg, NR_INACTIVE_ANON) -
> > +                              memcg_page_state(memcg, NR_ANON_MAPPED), 0);
>
> Please note that the NR_ANON_MAPPED does not include tmpfs memory but
> NR_[IN]ACTIVE_ANON does include the tmpfs.
>
> > +       /*
> > +        * Lazyfree pages are charged clean anonymous pages that are on the file
> > +        * LRU and can be reclaimed under memory pressure.
> > +        */
> > +       lazyfree = max_t(long, memcg_page_state(memcg, NR_ACTIVE_FILE) +
> > +                              memcg_page_state(memcg, NR_INACTIVE_FILE) -
> > +                              memcg_page_state(memcg, NR_FILE_PAGES), 0);
>
> Similarly NR_FILE_PAGES includes tmpfs memory but NR_[IN]ACTIVE_FILE does not.
>

Ah, so this adds to the motivation for providing the anon_reclaimable stat:
the calculation becomes even more convoluted and depends entirely on kernel
implementation details for both lazyfree memory and deferred split queues.

Did you have a calculation in mind for memcg_anon_reclaimable()?
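
For illustration only, here is a standalone sketch of how the two estimates
might be corrected for tmpfs, following the points raised above.  It is not
kernel code and not a calculation proposed in this thread.  The struct, its
field names, and the helper functions are hypothetical, and it assumes a
per-memcg shmem page count is available alongside the other counters and
that summing the two estimates is the intended result:

```c
/*
 * Hypothetical snapshot of per-memcg counters, in pages.  The field
 * names mirror the kernel stat items discussed above but this struct
 * is an assumption made for the example.
 */
struct memcg_counters {
	long active_anon;	/* NR_ACTIVE_ANON, includes tmpfs */
	long inactive_anon;	/* NR_INACTIVE_ANON, includes tmpfs */
	long anon_mapped;	/* NR_ANON_MAPPED, excludes tmpfs */
	long active_file;	/* NR_ACTIVE_FILE, excludes tmpfs */
	long inactive_file;	/* NR_INACTIVE_FILE, excludes tmpfs */
	long file_pages;	/* NR_FILE_PAGES, includes tmpfs */
	long shmem;		/* assumed tmpfs/shmem page count */
};

/* Clamp negative intermediate results to zero, like max_t(long, x, 0). */
static long clamp_nonneg(long v)
{
	return v > 0 ? v : 0;
}

/*
 * Anon pages on the anon LRU that are not mapped.  tmpfs pages sit on
 * the anon LRU but are not counted in anon_mapped, so remove them first.
 */
static long deferred_estimate(const struct memcg_counters *c)
{
	return clamp_nonneg(c->active_anon + c->inactive_anon -
			    c->shmem - c->anon_mapped);
}

/*
 * Pages on the file LRU that are not file pages.  file_pages counts
 * tmpfs but the file LRU does not, so remove tmpfs from file_pages.
 */
static long lazyfree_estimate(const struct memcg_counters *c)
{
	return clamp_nonneg(c->active_file + c->inactive_file -
			    (c->file_pages - c->shmem));
}

/* Sum of both estimates, in pages (assumed to be the intended stat). */
static long anon_reclaimable_estimate(const struct memcg_counters *c)
{
	return deferred_estimate(c) + lazyfree_estimate(c);
}
```

With 20 tmpfs pages in the memcg, the uncorrected subtraction would
overstate the deferred estimate by 20 pages and understate the lazyfree
estimate by up to 20 pages, which is the skew described above.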