Hello, Yang.

On Sat, May 02, 2020 at 06:27:21PM +0800, Yang Yingliang wrote:
> I find the number nr_dying_descendants is increasing:
> linux-dVpNUK:~ # find /sys/fs/cgroup/ -name cgroup.stat -exec grep
> '^nr_dying_descendants [^0]' {} +
> /sys/fs/cgroup/unified/cgroup.stat:nr_dying_descendants 80
> /sys/fs/cgroup/unified/system.slice/cgroup.stat:nr_dying_descendants 1
> /sys/fs/cgroup/unified/system.slice/system-hostos.slice/cgroup.stat:nr_dying_descendants 1
> /sys/fs/cgroup/unified/lxc/cgroup.stat:nr_dying_descendants 79
> /sys/fs/cgroup/unified/lxc/5f1fdb8c54fa40c3e599613dab6e4815058b76ebada8a27bc1fe80c0d4801764/cgroup.stat:nr_dying_descendants 78
> /sys/fs/cgroup/unified/lxc/5f1fdb8c54fa40c3e599613dab6e4815058b76ebada8a27bc1fe80c0d4801764/system.slice/cgroup.stat:nr_dying_descendants 78

Those numbers are nowhere close to causing OOM issues. There are some
aspects of page and other cache draining that are being improved, but
unless you're seeing numbers multiple orders of magnitude higher, this
isn't the source of your problem.

> The situation is the same as the one commit bd1060a1d671 ("sock, cgroup:
> add sock->sk_cgroup") describes:
> "On mode switch, cgroup references which are already being pointed to by
> socks may be leaked."

I'm doubtful that you're hitting that issue. Mode switching means memcg
being switched between the cgroup1 and cgroup2 hierarchies, which is
unlikely to be what's happening when you're launching docker containers.

The first step would be identifying where the memory is going and finding
out whether memcg is actually being switched between cgroup1 and cgroup2 -
look at the hierarchy number for memory in /proc/cgroups; if that's
switching between 0 and something non-zero, it is switching.

Thanks.

--
tejun
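
As an illustration of that last check, a minimal shell sketch, assuming
the standard four-column /proc/cgroups layout (subsys_name, hierarchy,
num_cgroups, enabled); the 5-second poll interval is an arbitrary choice:

    # Poll the hierarchy ID the memory controller is attached to.
    # 0 means memcg is bound to the cgroup2 (unified) hierarchy; a
    # non-zero ID means a cgroup1 hierarchy.  If the value flips
    # between 0 and non-zero over time, memcg is being mode-switched.
    while true; do
        printf '%s ' "$(date +%T)"
        awk '$1 == "memory" { print "memory hierarchy:", $2 }' /proc/cgroups
        sleep 5
    done

Leaving this running while the docker containers are launched should show
whether the memory controller's hierarchy number ever actually changes.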