On 17/01/2019 14:17, Arkadiusz Miśkiewicz wrote:
> On 17/01/2019 13:25, Aleksa Sarai wrote:
>> On 2019-01-17, Arkadiusz Miśkiewicz <a.miskiewicz@xxxxxxxxx> wrote:
>>> Using kernel 4.19.13.
>>>
>>> For one cgroup I noticed weird behaviour:
>>>
>>> # cat pids.current
>>> 60
>>> # cat cgroup.procs
>>> #
>>
>> Are there any zombies in the cgroup? pids.current is linked up directly
>> to __put_task_struct (so exit(2) won't decrease it, only the task_struct
>> actually being freed will decrease it).
>
> There are no zombie processes.
>
> In the meantime the problem has shown up on multiple servers, and so far
> I have only seen it in cgroups that were OOMed.
>
> What changed on these servers (yesterday) is turning on
> memory.oom.group=1 for all cgroups and changing memory.high from 1G to
> "max" (leaving only the memory.max=2G limit).
>
> Previously there was no such problem.

I'm attaching a reproducer. This time I tried it on a different
distribution kernel (Arch Linux). After 60s pids.current still shows 37
processes even though no processes are running (according to ps aux).

[root@warm ~]# uname -a
Linux warm 4.20.3-arch1-1-ARCH #1 SMP PREEMPT Wed Jan 16 22:38:58 UTC 2019 x86_64 GNU/Linux

[root@warm ~]# python3 cg.py
Created cgroup: /sys/fs/cgroup/test_26207
Start: pids.current: 0
Start: cgroup.procs:
0: pids.current: 62
0: cgroup.procs:
1: pids.current: 37
1: cgroup.procs:
2: pids.current: 37
2: cgroup.procs:
3: pids.current: 37
3: cgroup.procs:
4: pids.current: 37
4: cgroup.procs:
5: pids.current: 37
5: cgroup.procs:
6: pids.current: 37
6: cgroup.procs:
7: pids.current: 37
7: cgroup.procs:
8: pids.current: 37
8: cgroup.procs:
9: pids.current: 37
9: cgroup.procs:
10: pids.current: 37
10: cgroup.procs:
11: pids.current: 37
11: cgroup.procs:

-- 
Arkadiusz Miśkiewicz, arekm / ( maven.pl | pld-linux.org )
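A quick way to cross-check that no live or zombie task still accounts for the
leaked count is to scan /proc for tasks attached to the test cgroup. This is
not part of the original reproducer; it is a minimal sketch that assumes the
unified cgroup v2 hierarchy is mounted at /sys/fs/cgroup, and the
tasks_in_cgroup helper name is purely illustrative:

#!/usr/bin/python3
# Minimal sketch (not from the original thread): list every task, including
# zombies, that is still attached to a given cgroup v2 directory, by reading
# the "0::<path>" line in /proc/<pid>/cgroup.
# Assumes the unified hierarchy is mounted at /sys/fs/cgroup.
import os
import sys

CGROUP2_ROOT = "/sys/fs/cgroup"

def tasks_in_cgroup(cgdir):
    # Path relative to the cgroup2 mount, e.g. "/test_26207"
    rel = "/" + os.path.relpath(cgdir, CGROUP2_ROOT)
    found = []
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open("/proc/%s/cgroup" % pid) as f:
                lines = f.read().splitlines()
            with open("/proc/%s/stat" % pid) as f:
                # The state is the third stat field, right after the ")"
                # that closes the command name (which may contain spaces).
                state = f.read().rsplit(")", 1)[1].split()[0]
        except OSError:
            continue  # task exited while we were scanning
        if any(l.startswith("0::") and l[3:] == rel for l in lines):
            found.append((int(pid), state))
    return found

if __name__ == "__main__":
    for pid, state in tasks_in_cgroup(sys.argv[1]):
        print("pid %d state %s" % (pid, state))

Running it as "python3 check.py /sys/fs/cgroup/test_26207" while pids.current
stays non-zero and printing nothing would suggest the charges are held by
task_structs that have not yet been freed, as Aleksa described, rather than by
any visible process.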
#!/usr/bin/python3
import os
import sys
import time

# Create the test cgroup and configure it: pids limit, group OOM kill,
# and a 512M memory limit.
cgdir = "/sys/fs/cgroup/test_%d" % os.getpid()
print("Created cgroup: %s" % cgdir)
os.mkdir(cgdir)
open(os.path.join(cgdir, "cgroup.subtree_control"), 'w').write("+pids\n+io\n+memory")
open(os.path.join(cgdir, "pids.max"), 'w').write("60\n")
open(os.path.join(cgdir, "memory.oom.group"), 'w').write("1\n")
open(os.path.join(cgdir, "memory.max"), 'w').write("512M\n")

print("Start: pids.current: %s" % (open("%s/pids.current" % cgdir, "r").read().strip()))
print("Start: cgroup.procs: %s" % (open("%s/cgroup.procs" % cgdir, "r").read().strip()))

def run_job(cgdir):
    # Move this child into the test cgroup, then allocate memory until the
    # whole group gets OOM-killed (memory.oom.group=1).
    open(os.path.join(cgdir, "cgroup.procs"), 'w').write("%d\n" % os.getpid())
    a = ""
    while True:
        a += "a" * (10 * 1024 * 1024)
    sys.exit(0)

jobs = 1000
children = []
father = True
for job in range(0, jobs):
    child = os.fork()
    if child:
        children.append(child)
    else:
        father = False
        run_job(cgdir)
        sys.exit(0)

# Reap all children, then sample pids.current/cgroup.procs every 5 seconds.
for child in children:
    os.waitpid(child, 0)

for i in range(0, 12):
    print("%d: pids.current: %s" % (i, open("%s/pids.current" % cgdir, "r").read().strip()))
    print("%d: cgroup.procs: %s" % (i, open("%s/cgroup.procs" % cgdir, "r").read().strip()))
    time.sleep(5)