On Tue, Apr 28, 2020 at 05:10:33PM +0800, Yang Yingliang wrote: > Hi, > > On 2020/4/28 1:24, Roman Gushchin wrote: > > On Mon, Apr 27, 2020 at 01:13:04PM -0400, Johannes Weiner wrote: > > > +cc Roman who has been looking the most at this area > > > > > > On Mon, Apr 27, 2020 at 03:48:13PM +0800, Yang Yingliang wrote: > > > > +cc linux-mm@xxxxxxxxx <mailto:linux-mm@xxxxxxxxx> > > > > > > > > On 2020/4/26 19:21, Yang Yingliang wrote: > > > > > Hi, > > > > > > > > > > When I doing the follow test in kernel-5.7-rc2, I found mem-free is > > > > > decreased > > > > > > > > > > #!/bin/sh > > > > > cd /sys/fs/cgroup/memory/ > > > > > > > > > > for((i=0;i<45;i++)) > > > > > do > > > > > for((j=0;j<60000;j++)) > > > > > do > > > > > mkdir /sys/fs/cgroup/memory/yyl-cg$j > > > > > done > > > > > sleep 1 > > > > > ls /sys/fs/cgroup/memory/ | grep yyl | xargs rmdir > > > > > done > > > Should be easy enough to reproduce, thanks for the report. I'll try to > > > take a look later this week, unless Roman beats me to it. > > > > > > Is this a new observation in 5.7-rc2? > > > > > > Can you provide /sys/fs/cgroup/unified/cgroup.stat after the test? > I re-tested in 5.7-rc3, it has same problem and the > /sys/fs/cgroup/unified/cgroup.stat afther test is: > > nr_descendants 50 > nr_dying_descendants 0 > > > I'm actually trying to reproduce it now, but so far haven't found any issues. > > > > Yang, can you, please, attach the config you're using? > > > > And also confirm that you're giving the system some time before looking > > at the memory statistics? Reclaim of internal cgroup structure is a complex > > process which might take some time to finish. > > > > Is dmesg also clean? > config and dmesg are attached. Interesting... I've tried hard to reproduce, but haven't managed to get anything so far. You've a huge machine with a non-trivial hardware setup (e.g. those cpuset warnings in dmesg), so there must be something special about it. Could be anything: percpu, rcu, some incorrectly handled ENOMEM case. I've several questions, which might help the investigation: 1) Is there any known good revision, which doesn't leak? 2) Can you, please, check that all those cgroups were actually created? I mean if some mkdir calls returned an error, it could be a huge hint for us where to look. 3) Can you, please, dump /proc/slabinfo before and after the experiment? 4) Can you, please, repeat the experiment creating cgroups in /sys/fs/cgroup/unified instead of /sys/fs/cgroup/memory ? 5) If you're familiar with bcc tools (https://github.com/iovisor/bcc), can you, please, run memleak.py during the experiment? Thank you!