On Tue, Feb 02, 2021 at 03:07:47PM -0800, Roman Gushchin wrote:
> On Tue, Feb 02, 2021 at 01:47:40PM -0500, Johannes Weiner wrote:
> > The memcg hotunplug callback erroneously flushes counts on the local
> > CPU, not the counts of the CPU going away; those counts will be lost.
> >
> > Flush the CPU that is actually going away.
> >
> > Also simplify the code a bit by using mod_memcg_state() and
> > count_memcg_events() instead of open-coding the upward flush - this is
> > comparable to how vmstat.c handles hotunplug flushing.
>
> To the whole series: it's really nice to have an accurate stats at
> non-leaf levels. Just as an illustration: if there are 32 CPUs and
> 1000 sub-cgroups (which is an absolutely realistic number, because
> often there are many dying generations of each cgroup), the error
> margin is 3.9GB. It makes all numbers pretty much random and all
> possible tests extremely flaky.

Btw, I was just looking into the kmem kselftest failures/flakiness,
which are caused by exactly this problem: without waiting for reclaim
of the dying cgroups to finish, we can't make any reliable assumptions
about what to expect from memcg stats.

So looking forward to having this patchset merged!
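
For reference, the 3.9GB error margin quoted above can be reproduced with a back-of-the-envelope calculation. The sketch below is only an illustration: the thread does not state the per-CPU batch size or the page size, so the values used here (up to 32 pages of unflushed delta per CPU per cgroup, and 4 KiB pages) are assumptions, and the function name is hypothetical.

```python
# Assumptions, not stated in the thread: each per-CPU counter may hold up to
# 32 pages of not-yet-flushed delta per cgroup, and a page is 4 KiB.
ASSUMED_BATCH_PAGES = 32
ASSUMED_PAGE_SIZE = 4096


def worst_case_error_bytes(cpus, cgroups,
                           batch_pages=ASSUMED_BATCH_PAGES,
                           page_size=ASSUMED_PAGE_SIZE):
    """Upper bound on the stats error at a parent level, in bytes.

    Every (CPU, cgroup) pair can hide up to one batch of pages that has
    not been propagated upward yet, so the bound is simply the product.
    """
    return cpus * cgroups * batch_pages * page_size


if __name__ == "__main__":
    err = worst_case_error_bytes(cpus=32, cgroups=1000)
    print(f"{err} bytes = {err / 2**30:.1f} GiB")
```

Under these assumptions, 32 CPUs and 1000 sub-cgroups give 4194304000 bytes, i.e. about 3.9 GiB, which matches the number in the quoted mail. The bound scales linearly with both the CPU count and the number of (live or dying) cgroups.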