Re: Showing /sys/fs/cgroup/memory/memory.stat very slow on some machines

Hello everyone,


Excuse me for dropping in, but I can't ignore that this sounds almost exactly like the issue
I have been going nuts over for the past two months, ever since I switched from a 3.x to a 4.x kernel.

Please see the thread `Caching/buffers become useless after some time`. What I did not mention there
is that cgroups are also mounted, though not actively used, because I have a scripting issue with
setting them up correctly. There is, however, live data in /sys/fs/cgroup/memory/memory.stat, so the
problem might be related to cgroups after all; I had not considered that until now.

It is the same story here as well: echoing 2 into /proc/sys/vm/drop_caches solves the issue
temporarily, for maybe 2-4 days under heavy I/O.

I can, however, test and experiment with cgroups. If anyone wants to suggest disabling them, I'd gladly
monitor the behavior (please tell me what to do and how, if necessary). I am also curious: could you
disable cgroups as well, just to see whether it helps and the problem really is associated with cgroups?
My vm-related sysctl settings are:

vm.dirty_ratio = 15
vm.dirty_background_ratio = 3
vm.vfs_cache_pressure = 1

I have the impression (though I am not sure) that this issue has been less significant since I lowered
these values; previously I had dirty_ratio and dirty_background_ratio at 90/80, and I no longer remember
the old cache pressure value. Still, a lot of RAM remains unallocated, usually at least half and often
much more entirely unused; these hosts also have 64GB of RAM.

I hope this is indeed related, so that we can work together on pinpointing it. The issue is not going
away for me, and it causes a lot of headaches by slowing down my entire business.

2018-07-24 12:05 GMT+02:00 Bruce Merry <bmerry@xxxxxxxxx>:
On 18 July 2018 at 19:40, Bruce Merry <bmerry@xxxxxxxxx> wrote:
>> Yes, very easy to produce zombies, though I don't think kernel
>> provides any way to tell how many zombies exist on the system.
>>
>> To create a zombie, first create a memcg node, enter that memcg,
>> create a tmpfs file of few KiBs, exit the memcg and rmdir the memcg.
>> That memcg will be a zombie until you delete that tmpfs file.
>
> Thanks, that makes sense. I'll see if I can reproduce the issue.
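
For concreteness, that recipe boils down to something like the minimal
sketch below. This is just an illustration, assuming a cgroup v1 memory
controller mounted at /sys/fs/cgroup/memory, /dev/shm being a tmpfs
mount, and root privileges; the names are made up.

#!/usr/bin/env python3
import os

memcg = '/sys/fs/cgroup/memory/zombie-demo'   # illustrative name
pid = os.getpid()

os.mkdir(memcg)                               # create a memcg node
with open(memcg + '/tasks', 'w') as f:        # enter the memcg
    print(pid, file=f)
with open('/dev/shm/zombie-demo', 'wb') as f:
    f.write(b'x' * 8192)                      # charge a few KiB of tmpfs pages
with open('/sys/fs/cgroup/memory/tasks', 'w') as f:
    print(pid, file=f)                        # leave the memcg (back to root)
os.rmdir(memcg)                               # the directory is gone...
# ...but the memcg lingers as a zombie until /dev/shm/zombie-demo is
# deleted, because its tmpfs pages are still charged to the removed group.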

Hi

I've had some time to experiment with this issue, and I've now got a
way to reproduce it fairly reliably, including with a stock 4.17.8
kernel. However, it's very phase-of-the-moon stuff, and even
apparently trivial changes (like switching the order in which the
files are statted) make the issue disappear.

To reproduce:
1. Start cadvisor running. I use the 0.30.2 binary from GitHub, and
run it with: sudo ./cadvisor-0.30.2 --logtostderr=true
2. Run the Python 3 script below, which repeatedly creates a cgroup,
enters it, stats some files in it, and leaves it again (and removes
it). It takes a few minutes to run.
3. time cat /sys/fs/cgroup/memory/memory.stat. It now takes about 20ms for me.
4. sudo sysctl vm.drop_caches=2
5. time cat /sys/fs/cgroup/memory/memory.stat. It is back to 1-2ms.
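
If it's more convenient, the measurement in steps 3 and 5 can also be
done from Python; a small sketch (the path assumes the v1 memory
controller):

#!/usr/bin/env python3
import time

# Time a single read of memory.stat, equivalent to `time cat` above.
path = '/sys/fs/cgroup/memory/memory.stat'
start = time.perf_counter()
with open(path) as f:
    f.read()
print('read took %.2f ms' % ((time.perf_counter() - start) * 1000))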

I've also added some code to memcg_stat_show to report the number of
cgroups in the hierarchy (iterations in for_each_mem_cgroup_tree).
Running the script increases it from ~700 to ~41000. The script
iterates 250,000 times, so only some fraction of the cgroups become
zombies.

I also tried the suggestion of force_empty: it makes the problem go
away, but is also very, very slow (about 0.5s per iteration), and
given the sensitivity of the test to small changes I don't know how
meaningful that is.
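
To be concrete, the force_empty variant amounts to writing to the
cgroup's memory.force_empty file before removing it. As a rough sketch
on top of the script below (cgroup v1, where a write to that file asks
the kernel to reclaim the group's charges):

def force_empty(cgroup):
    # cgroup v1: writing to memory.force_empty asks the kernel to
    # reclaim the pages charged to this group.
    with open(cgroup + '/memory.force_empty', 'w') as f:
        f.write('0')

with force_empty(name) called in the loop just before os.rmdir(name).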

Reproduction code follows (if you have tqdm installed you get a nice
progress bar, but it's not required). Hopefully Gmail doesn't do any
format mangling:


#!/usr/bin/env python3
import os

# Use tqdm's trange for a progress bar if available; otherwise fall back
# to the built-in range.
try:
    from tqdm import trange as range
except ImportError:
    pass


def clean():
    # Remove the test cgroup if a previous run left it behind.
    try:
        os.rmdir(name)
    except FileNotFoundError:
        pass


def move_to(cgroup):
    # Move this process into the given cgroup (v1 tasks interface).
    with open(cgroup + '/tasks', 'w') as f:
        print(pid, file=f)


pid = os.getpid()
os.chdir('/sys/fs/cgroup/memory')
name = 'dummy'
N = 250000
clean()
try:
    for i in range(N):
        # Create a cgroup, enter it, stat some of its files, then leave
        # and remove it again.
        os.mkdir(name)
        move_to(name)
        for filename in ['memory.stat', 'memory.swappiness']:
            os.stat(os.path.join(name, filename))
        move_to('user.slice')
        os.rmdir(name)
finally:
    # Make sure we end up back in user.slice with the cgroup removed,
    # even if the loop is interrupted.
    move_to('user.slice')
    clean()


Regards
Bruce
--
Bruce Merry
Senior Science Processing Developer
SKA South Africa


