On Tue, Feb 02, 2021 at 06:28:53PM -0800, Roman Gushchin wrote:
> On Tue, Feb 02, 2021 at 03:07:47PM -0800, Roman Gushchin wrote:
> > On Tue, Feb 02, 2021 at 01:47:40PM -0500, Johannes Weiner wrote:
> > > The memcg hotunplug callback erroneously flushes counts on the local
> > > CPU, not the counts of the CPU going away; those counts will be lost.
> > >
> > > Flush the CPU that is actually going away.
> > >
> > > Also simplify the code a bit by using mod_memcg_state() and
> > > count_memcg_events() instead of open-coding the upward flush - this is
> > > comparable to how vmstat.c handles hotunplug flushing.
> >
> > To the whole series: it's really nice to have an accurate stats at
> > non-leaf levels. Just as an illustration: if there are 32 CPUs and
> > 1000 sub-cgroups (which is an absolutely realistic number, because
> > often there are many dying generations of each cgroup), the error
> > margin is 3.9GB. It makes all numbers pretty much random and all
> > possible tests extremely flaky.
>
> Btw, I was just looking into kmem kselftests failures/flakiness,
> which is caused by exactly this problem: without waiting for the
> finish of dying cgroups reclaim, we can't make any reliable assumptions
> about what to expect from memcg stats.

Good point about the selftests.
I gave them a shot, and indeed this series makes test_kmem work again:

vanilla:
ok 1 test_kmem_basic
memory.current = 8810496
slab + anon + file + kernel_stack = 17074568
slab = 6101384
anon = 946176
file = 0
kernel_stack = 10027008
not ok 2 test_kmem_memcg_deletion
ok 3 test_kmem_proc_kpagecgroup
ok 4 test_kmem_kernel_stacks
ok 5 test_kmem_dead_cgroups
ok 6 test_percpu_basic

patched:
ok 1 test_kmem_basic
ok 2 test_kmem_memcg_deletion
ok 3 test_kmem_proc_kpagecgroup
ok 4 test_kmem_kernel_stacks
ok 5 test_kmem_dead_cgroups
ok 6 test_percpu_basic

It even passes with a reduced margin in the patched kernel, since the
percpu drift - which this test already tried to account for - is now
only on the page_counter side (whereas memory.stat is always precise).

I'm going to include that data in the v2 changelog, as well as a patch
to update test_kmem.c to the more stringent error tolerances.

> So looking forward to have this patchset merged!

Thanks
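
For reference, the arithmetic behind the 3.9GB figure quoted above and
the gap in the vanilla output can be sketched as below. This is only an
illustration, not code from the series or from test_kmem.c: the helper
names are made up, and the 4 KiB page size and the per-CPU batch of 32
pages per cgroup are assumptions that happen to reproduce the quoted
number.

```c
#include <stdlib.h>

/* Assumed values, not stated in the thread. */
#define ASSUMED_PAGE_SIZE   4096LL  /* bytes per page */
#define ASSUMED_BATCH_PAGES 32LL    /* pages a CPU may hold back per cgroup */

/*
 * Hypothetical helper: worst-case bytes that can sit unflushed in
 * per-CPU caches across all sub-cgroups.
 */
long long worst_case_drift(long long nr_cpus, long long nr_cgroups)
{
	return nr_cpus * nr_cgroups * ASSUMED_BATCH_PAGES * ASSUMED_PAGE_SIZE;
}

/*
 * Hypothetical helper: absolute gap between memory.current and the sum
 * of the memory.stat items printed by the failing test.
 */
long long stat_sum_gap(long long current, long long slab, long long anon,
		       long long file, long long kernel_stack)
{
	long long sum = slab + anon + file + kernel_stack;

	return llabs(sum - current);
}
```

Under those assumptions, worst_case_drift(32, 1000) is 4194304000 bytes,
i.e. about 3.9 GiB, and stat_sum_gap() on the vanilla numbers above is
8264072 bytes.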