On Thu, Jul 11, 2019 at 4:38 AM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
>
> On Wed, Jul 10, 2019 at 05:27:03AM -0400, Yafang Shao wrote:
> > After commit 815744d75152 ("mm: memcontrol: don't batch updates of
> > local VM stats and events"), the local VM stats are not consistent
> > with the total VM stats.
> >
> > Below is one example on my server (with 8 CPUs):
> >
> >   inactive_file 3567570944
> >   total_inactive_file 3568029696
> >
> > The deviation looks large because the 'val' passed to
> > __mod_memcg_state() is in pages, while the value reported by
> > memcg_stat_show() is in bytes. So the maximum deviation between the
> > local VM stats and the total VM stats can be
> > (32 * number_of_cpu * PAGE_SIZE), which may be unacceptably large.
> >
> > We should make the local VM stats consistent with the total stats.
> > Although the deviation between local VM events and total events is
> > not as large, I think we'd better make them consistent with each
> > other as well.
>
> Ha - the local stats are not percpu-fuzzy enough... But I guess that
> is a valid complaint.
>
> > ---
> >  mm/memcontrol.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index ba9138a..a9448c3 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -691,12 +691,12 @@ void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val)
> >  	if (mem_cgroup_disabled())
> >  		return;
> >
> > -	__this_cpu_add(memcg->vmstats_local->stat[idx], val);
> >
> >  	x = val + __this_cpu_read(memcg->vmstats_percpu->stat[idx]);
> >  	if (unlikely(abs(x) > MEMCG_CHARGE_BATCH)) {
> >  		struct mem_cgroup *mi;
> >
> > +		__this_cpu_add(memcg->vmstats_local->stat[idx], x);
> >  		for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
> >  			atomic_long_add(x, &mi->vmstats[idx]);
> >  		x = 0;
> > @@ -773,12 +773,12 @@ void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx,
> >  	if (mem_cgroup_disabled())
> >  		return;
> >
> > -	__this_cpu_add(memcg->vmstats_local->events[idx], count);
> >
> >  	x = count + __this_cpu_read(memcg->vmstats_percpu->events[idx]);
> >  	if (unlikely(x > MEMCG_CHARGE_BATCH)) {
> >  		struct mem_cgroup *mi;
> >
> > +		__this_cpu_add(memcg->vmstats_local->events[idx], x);
> >  		for (mi = memcg; mi; mi = parent_mem_cgroup(mi))
> >  			atomic_long_add(x, &mi->vmevents[idx]);
> >  		x = 0;
>
> Please also update __mod_lruvec_state() to keep this behavior the same
> across counters, to make sure we won't have any surprises when
> switching between them.
>
> And please add comments explaining that we batch local counters to
> keep them in sync with the hierarchical ones. Because it does look a
> little odd without explanation.

Sure, I will do it.

Thanks
Yafang
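
A quick sanity check of the bound quoted above (assuming
MEMCG_CHARGE_BATCH is 32 and PAGE_SIZE is 4096 on the reporting
machine):

  worst case: 32 pages/CPU * 8 CPUs * 4096 bytes/page = 1,048,576 bytes (1 MiB)
  observed:   3568029696 - 3567570944 = 458,752 bytes = 112 pages (~448 KiB)

So the reported gap is well within the worst-case per-cpu slack.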
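
To see why moving the local update into the flush branch keeps the
local and hierarchical counters in lockstep, here is a minimal
userspace model of the batching scheme (a sketch only; the variable
names are illustrative stand-ins, not kernel symbols):

/*
 * Single-CPU model of the post-patch batching: the local and
 * hierarchical counters are only updated together, at flush time,
 * with the same delta x, so they can never drift apart. Only the
 * per-cpu pending value remains "fuzzy".
 */
#include <stdio.h>
#include <stdlib.h>

#define BATCH 32		/* stands in for MEMCG_CHARGE_BATCH */

static long percpu_pending;	/* models vmstats_percpu->stat[idx] */
static long local_stat;		/* models vmstats_local->stat[idx] */
static long hierarchical_stat;	/* models vmstats[idx] */

static void mod_state(int val)
{
	long x = val + percpu_pending;

	if (labs(x) > BATCH) {
		local_stat += x;	/* batched, as in the patch */
		hierarchical_stat += x;
		x = 0;
	}
	percpu_pending = x;
}

int main(void)
{
	int i;

	for (i = 0; i < 1000; i++)
		mod_state(1 + rand() % 4);

	/* The two counters always agree; only the pending delta is fuzzy. */
	printf("local=%ld hierarchical=%ld pending=%ld\n",
	       local_stat, hierarchical_stat, percpu_pending);
	return 0;
}

Before the patch, local_stat would have been incremented by val on
every call, so it could lead hierarchical_stat by up to BATCH pages
per CPU.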
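
For reference, the analogous change to __mod_lruvec_state() that
Johannes asks for would plausibly look like the sketch below. It
follows the 5.2-era layout of mm/memcontrol.c; the
pn->lruvec_stat_local / pn->lruvec_stat_cpu field names and the
parent_nodeinfo() helper are taken from that code and should be
double-checked against the tree being patched:

--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ void __mod_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx, int val)
 	/* Update memcg */
 	__mod_memcg_state(memcg, idx, val);
 
-	/* Update lruvec */
-	__this_cpu_add(pn->lruvec_stat_local->count[idx], val);
 
 	x = val + __this_cpu_read(pn->lruvec_stat_cpu->count[idx]);
 	if (unlikely(abs(x) > MEMCG_CHARGE_BATCH)) {
 		struct mem_cgroup_per_node *pi;
 
+		/*
+		 * Batch the local counter update so it stays in sync
+		 * with the hierarchical counters below.
+		 */
+		__this_cpu_add(pn->lruvec_stat_local->count[idx], x);
 		for (pi = pn; pi; pi = parent_nodeinfo(pi, pgdat->node_id))
 			atomic_long_add(x, &pi->lruvec_stat[idx]);
 		x = 0;
 	}
 	__this_cpu_write(pn->lruvec_stat_cpu->count[idx], x);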