On Fri, May 8, 2020 at 3:34 AM Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
>
> On Fri, May 8, 2020 at 4:49 AM Shakeel Butt <shakeelb@xxxxxxxxxx> wrote:
> >
> > One way to measure the efficiency of memory reclaim is to look at the
> > ratio (pgscan+pgrefill)/pgsteal. However at the moment these stats are
> > not updated consistently at the system level and the ratio of these are
> > not very meaningful. The pgsteal and pgscan are updated for only global
> > reclaim while pgrefill gets updated for global as well as cgroup
> > reclaim.
> >
>
> Hi Shakeel,
>
> We always use pgscan and pgsteal for monitoring the system level
> memory pressure, for example, by using sysstat(sar) or some other
> monitor tools.

Don't you need pgrefill in addition to pgscan and pgsteal to get the
full picture of the reclaim activity?

> But with this change, these two counters include the memcg pressure as
> well. It is not easy to know whether the pgscan and pgsteal are caused
> by system level pressure or only some specific memcgs reaching their
> memory limit.
>
> How about adding cgroup_reclaim() to pgrefill as well ?
>

I am looking for all the reclaim activity on the system. Adding
!cgroup_reclaim to pgrefill will skip the cgroup reclaim activity.
Maybe adding pgsteal_cgroup and pgscan_cgroup would be better.

> > Please note that this difference is only for system level vmstats. The
> > cgroup stats returned by memory.stat are actually consistent. The
> > cgroup's pgsteal contains number of reclaimed pages for global as well
> > as cgroup reclaim. So, one way to get the system level stats is to get
> > these stats from root's memory.stat but root does not expose that
> > interface. Also for !CONFIG_MEMCG machines /proc/vmstat is the only way
> > to get these stats. So, make these stats consistent.
> >
> > Signed-off-by: Shakeel Butt <shakeelb@xxxxxxxxxx>
> > ---
> >  mm/vmscan.c | 6 ++----
> >  1 file changed, 2 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index cc555903a332..51f7d1efc912 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -1943,8 +1943,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
> >         reclaim_stat->recent_scanned[file] += nr_taken;
> >
> >         item = current_is_kswapd() ? PGSCAN_KSWAPD : PGSCAN_DIRECT;
> > -       if (!cgroup_reclaim(sc))
> > -               __count_vm_events(item, nr_scanned);
> > +       __count_vm_events(item, nr_scanned);
> >         __count_memcg_events(lruvec_memcg(lruvec), item, nr_scanned);
> >         spin_unlock_irq(&pgdat->lru_lock);
> >
> > @@ -1957,8 +1956,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
> >         spin_lock_irq(&pgdat->lru_lock);
> >
> >         item = current_is_kswapd() ? PGSTEAL_KSWAPD : PGSTEAL_DIRECT;
> > -       if (!cgroup_reclaim(sc))
> > -               __count_vm_events(item, nr_reclaimed);
> > +       __count_vm_events(item, nr_reclaimed);
> >         __count_memcg_events(lruvec_memcg(lruvec), item, nr_reclaimed);
> >         reclaim_stat->recent_rotated[0] += stat.nr_activate[0];
> >         reclaim_stat->recent_rotated[1] += stat.nr_activate[1];
> > --
> > 2.26.2.526.g744177e7f7-goog
> >
> >
>
>
> --
> Thanks
> Yafang
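
FWIW, once pgscan and pgsteal are updated consistently with pgrefill, the
(pgscan+pgrefill)/pgsteal ratio from the commit message can be read
straight out of /proc/vmstat. Below is a minimal userspace sketch, not
part of the patch; it only assumes the pgscan_kswapd/pgscan_direct,
pgsteal_kswapd/pgsteal_direct and pgrefill counters that /proc/vmstat
already exposes:

/* reclaim_ratio.c: print (pgscan + pgrefill) / pgsteal from /proc/vmstat */
#include <stdio.h>
#include <string.h>

int main(void)
{
        unsigned long long pgscan = 0, pgsteal = 0, pgrefill = 0, val;
        char name[128];
        FILE *fp = fopen("/proc/vmstat", "r");

        if (!fp) {
                perror("/proc/vmstat");
                return 1;
        }

        while (fscanf(fp, "%127s %llu", name, &val) == 2) {
                /* the counters this patch makes consistent */
                if (!strcmp(name, "pgscan_kswapd") ||
                    !strcmp(name, "pgscan_direct"))
                        pgscan += val;
                else if (!strcmp(name, "pgsteal_kswapd") ||
                         !strcmp(name, "pgsteal_direct"))
                        pgsteal += val;
                else if (!strcmp(name, "pgrefill"))
                        pgrefill = val;
        }
        fclose(fp);

        if (pgsteal)
                printf("(pgscan + pgrefill) / pgsteal = %.2f\n",
                       (double)(pgscan + pgrefill) / pgsteal);
        return 0;
}

Sampling it before and after a workload and taking the delta gives the
reclaim cost per reclaimed page; without the patch that number is skewed
because pgrefill includes cgroup reclaim while pgscan and pgsteal do not.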