On Thu, Jul 16, 2020 at 1:58 PM David Rientjes <rientjes@xxxxxxxxxx> wrote:
>
> Userspace can lack insight into the amount of memory that can be reclaimed
> from a memcg based on values from memory.stat. Two specific examples:
>
>  - Lazy freeable memory (MADV_FREE) that are clean anonymous pages on the
>    inactive file LRU that can be quickly reclaimed under memory pressure
>    but otherwise shows up as mapped anon in memory.stat, and
>
>  - Memory on deferred split queues (thp) that are compound pages that can
>    be split and uncharged from the memcg under memory pressure, but
>    otherwise shows up as charged anon LRU memory in memory.stat.
>
> Both of this anonymous usage is also charged to memory.current.
>
> Userspace can currently derive this information but it depends on kernel
> implementation details for how this memory is handled for the purposes of
> reclaim (anon on inactive file LRU or unmapped anon on the LRU).
>
> For the purposes of writing portable userspace code that does not need to
> have insight into the kernel implementation for reclaimable memory, this
> exports a stat that reveals the amount of anonymous memory that can be
> reclaimed and uncharged from the memcg to start new applications.
>
> As the kernel implementation evolves for memory that can be reclaimed
> under memory pressure, this stat can be kept consistent.
>
> Signed-off-by: David Rientjes <rientjes@xxxxxxxxxx>
> ---
>  Documentation/admin-guide/cgroup-v2.rst |  6 +++++
>  mm/memcontrol.c                         | 31 +++++++++++++++++++++++++
>  2 files changed, 37 insertions(+)
>
> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> @@ -1296,6 +1296,12 @@ PAGE_SIZE multiple when read back.
>               Amount of memory used in anonymous mappings backed by
>               transparent hugepages
>
> +       anon_reclaimable
> +             The amount of charged anonymous memory that can be reclaimed
> +             under memory pressure without swap. This currently includes
> +             lazy freeable memory (MADV_FREE) and compound pages that can be
> +             split and uncharged.
> +
>         inactive_anon, active_anon, inactive_file, active_file, unevictable
>               Amount of memory, swap-backed and filesystem-backed,
>               on the internal memory management lists used by the
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1350,6 +1350,32 @@ static bool mem_cgroup_wait_acct_move(struct mem_cgroup *memcg)
>       return false;
>  }
>
> +/*
> + * Returns the amount of anon memory that is charged to the memcg that is
> + * reclaimable under memory pressure without swap, in pages.
> + */
> +static unsigned long memcg_anon_reclaimable(struct mem_cgroup *memcg)
> +{
> +     long deferred, lazyfree;
> +
> +     /*
> +      * Deferred pages are charged anonymous pages that are on the LRU but
> +      * are unmapped. These compound pages are split under memory pressure.
> +      */
> +     deferred = max_t(long, memcg_page_state(memcg, NR_ACTIVE_ANON) +
> +                            memcg_page_state(memcg, NR_INACTIVE_ANON) -
> +                            memcg_page_state(memcg, NR_ANON_MAPPED), 0);

Please note that NR_ANON_MAPPED does not include tmpfs memory, but
NR_[IN]ACTIVE_ANON does.

> +     /*
> +      * Lazyfree pages are charged clean anonymous pages that are on the file
> +      * LRU and can be reclaimed under memory pressure.
> +      */
> +     lazyfree = max_t(long, memcg_page_state(memcg, NR_ACTIVE_FILE) +
> +                            memcg_page_state(memcg, NR_INACTIVE_FILE) -
> +                            memcg_page_state(memcg, NR_FILE_PAGES), 0);

Similarly, NR_FILE_PAGES includes tmpfs memory, but NR_[IN]ACTIVE_FILE does not.
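
To make the effect of tmpfs on the two subtractions concrete, here is a
small userspace sketch with made-up numbers. It is not kernel code: the
struct, the helper names (clamp_nonneg, deferred_estimate,
lazyfree_estimate, add_tmpfs) and all page counts are hypothetical. It
only assumes what is noted above, namely that tmpfs pages are counted in
NR_[IN]ACTIVE_ANON and NR_FILE_PAGES but not in NR_ANON_MAPPED or
NR_[IN]ACTIVE_FILE.

```c
/* Hypothetical per-memcg counters, in pages. The field names mirror the
 * kernel stat items for readability; this struct is an illustration only. */
struct memcg_counters {
	long active_anon;   /* NR_ACTIVE_ANON: includes tmpfs */
	long inactive_anon; /* NR_INACTIVE_ANON: includes tmpfs */
	long anon_mapped;   /* NR_ANON_MAPPED: excludes tmpfs */
	long active_file;   /* NR_ACTIVE_FILE: excludes tmpfs */
	long inactive_file; /* NR_INACTIVE_FILE: excludes tmpfs */
	long file_pages;    /* NR_FILE_PAGES: includes tmpfs */
};

/* Stand-in for max_t(long, v, 0) in the quoted patch. */
static long clamp_nonneg(long v)
{
	return v > 0 ? v : 0;
}

/* Same arithmetic as the "deferred" computation in the quoted patch. */
static long deferred_estimate(const struct memcg_counters *c)
{
	return clamp_nonneg(c->active_anon + c->inactive_anon - c->anon_mapped);
}

/* Same arithmetic as the "lazyfree" computation in the quoted patch. */
static long lazyfree_estimate(const struct memcg_counters *c)
{
	return clamp_nonneg(c->active_file + c->inactive_file - c->file_pages);
}

/* Assumption: charging tmpfs pages raises an anon LRU counter and
 * NR_FILE_PAGES, and leaves the other counters untouched. The choice of
 * the inactive list here is arbitrary. */
static void add_tmpfs(struct memcg_counters *c, long pages)
{
	c->inactive_anon += pages;
	c->file_pages += pages;
}
```

With a hypothetical baseline of {100, 50, 140, 200, 120, 300} the two
estimates are 10 and 20 pages. After add_tmpfs(&c, 64) the deferred
estimate grows to 74 and the lazyfree estimate clamps to 0, even though
no deferred or lazyfree memory changed in this model.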