I'm not sure this is a cache issue; to me it feels like a memory leak. The MDS process is now at 129GB resident (I haven't had a window to upgrade yet) against a configured 80GB cache.
[root@mds0 ceph-admin]# ceph daemon mds.mds0 cache status
{
    "pool": {
        "items": 166753076,
        "bytes": 71766944952
    }
}
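For context, here's a minimal sketch of how I'm cross-checking the numbers above (assuming the daemon exposes the limit as mds_cache_memory_limit, which is what I have set to 80GB):

# configured cache memory limit, in bytes, as the running daemon sees it
ceph daemon mds.mds0 config get mds_cache_memory_limit

# resident set size (KiB) of the ceph-mds process, to compare against the cache pool bytes
ps -o rss= -C ceph-mds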
I ran a 10-minute heap profile:
[root@mds0 ceph-admin]# ceph tell mds.mds0 heap start_profiler
2018-05-25 08:15:04.428519 7f3f657fa700 0 client.127046191 ms_handle_reset on 10.124.103.50:6800/2248223690
2018-05-25 08:15:04.447528 7f3f667fc700 0 client.127055541 ms_handle_reset on 10.124.103.50:6800/2248223690
mds.mds0 started profiler
[root@mds0 ceph-admin]# ceph tell mds.mds0 heap dump
2018-05-25 08:25:14.265450 7f1774ff9700 0 client.127057266 ms_handle_reset on 10.124.103.50:6800/2248223690
2018-05-25 08:25:14.356292 7f1775ffb700 0 client.127057269 ms_handle_reset on 10.124.103.50:6800/2248223690
mds.mds0 dumping heap profile now.
------------------------------------------------
MALLOC: 123658130320 (117929.6 MiB) Bytes in use by application
MALLOC: + 0 ( 0.0 MiB) Bytes in page heap freelist
MALLOC: + 6969713096 ( 6646.8 MiB) Bytes in central cache freelist
MALLOC: + 26700832 ( 25.5 MiB) Bytes in transfer cache freelist
MALLOC: + 54460040 ( 51.9 MiB) Bytes in thread cache freelists
MALLOC: + 531034272 ( 506.4 MiB) Bytes in malloc metadata
MALLOC: ------------
MALLOC: = 131240038560 (125160.3 MiB) Actual memory used (physical + swap)
MALLOC: + 7426875392 ( 7082.8 MiB) Bytes released to OS (aka unmapped)
MALLOC: ------------
MALLOC: = 138666913952 (132243.1 MiB) Virtual address space used
MALLOC:
MALLOC: 7434952 Spans in use
MALLOC: 20 Thread heaps in use
MALLOC: 8192 Tcmalloc page size
------------------------------------------------
Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.
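If the gap were just tcmalloc sitting on freed pages, I'd expect the release command to hand it back. A quick sketch of what I'd try next (heap release/stats are the stock "ceph tell ... heap" subcommands, though I haven't verified they make a dent here):

# ask tcmalloc to release freelist memory back to the OS (ReleaseFreeMemory via madvise)
ceph tell mds.mds0 heap release

# re-check the MALLOC summary afterwards
ceph tell mds.mds0 heap stats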
[root@mds0 ceph-admin]# ceph tell mds.mds0 heap stop_profiler
2018-05-25 08:25:26.394877 7fbe48ff9700 0 client.127047898 ms_handle_reset on 10.124.103.50:6800/2248223690
2018-05-25 08:25:26.736909 7fbe49ffb700 0 client.127035608 ms_handle_reset on 10.124.103.50:6800/2248223690
mds.mds0 stopped profiler
[root@mds0 ceph-admin]# pprof --pdf /bin/ceph-mds /var/log/ceph/mds.mds0.profile.000* > profile.pdf
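In case the attached PDF is awkward to read, the same profile dumps can also be summarized as text (same files as above; --text is a standard pprof output mode):

# top allocation sites as a flat text listing instead of a call-graph PDF
pprof --text /bin/ceph-mds /var/log/ceph/mds.mds0.profile.000* | head -30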
On Thu, May 10, 2018 at 2:11 PM, Patrick Donnelly <pdonnell@xxxxxxxxxx> wrote:
> On Thu, May 10, 2018 at 12:00 PM, Brady Deetz <bdeetz@xxxxxxxxx> wrote:
>> [ceph-admin@mds0 ~]$ ps aux | grep ceph-mds
>> ceph 1841 3.5 94.3 133703308 124425384 ? Ssl Apr04 1808:32
>> /usr/bin/ceph-mds -f --cluster ceph --id mds0 --setuser ceph --setgroup ceph
>>
>>
>> [ceph-admin@mds0 ~]$ sudo ceph daemon mds.mds0 cache status
>> {
>>     "pool": {
>>         "items": 173261056,
>>         "bytes": 76504108600
>>     }
>> }
>>
>> So, 80GB is my configured limit for the cache and it appears the mds is
>> following that limit. But, the mds process is using over 100GB RAM in my
>> 128GB host. I thought I was playing it safe by configuring at 80. What other
>> things consume a lot of RAM for this process?
>>
>> Let me know if I need to create a new thread.
>
> The cache size measurement is imprecise pre-12.2.5 [1]. You should upgrade ASAP.
>
> [1] https://tracker.ceph.com/issues/22972
> --
> Patrick Donnelly
Attachment: profile.pdf (Adobe PDF document)