MDS allocates all memory (>500G) replaying, OOM-killed, repeat

Hello


We are experiencing an issue where our Ceph MDS allocates over 500G of RAM during replay, is killed by the kernel OOM killer, restarts, and repeats the cycle. We have 3 MDS daemons on different machines, and all of them exhibit this behavior. We are running the following versions (from Docker):


  • ceph/daemon:v3.2.1-stable-3.2-luminous-centos-7
  • ceph/daemon:v3.2.1-stable-3.2-luminous-centos-7
  • ceph/daemon:v3.1.0-stable-3.1-luminous-centos-7 (downgraded in last-ditch effort to resolve, didn't help)

The machines hosting the MDS instances have 512G of RAM. We tried adding swap, and the MDS simply started eating into that as well (and became very slow, eventually being kicked out of the cluster for exceeding the mds_beacon_grace of 240 seconds). We have set mds_cache_memory_limit to many values, from 200G down to the default of 1073741824 (1 GiB), and the result of replay is always the same: the MDS keeps allocating memory until the kernel OOM killer stops it (or until the mds_beacon_grace period expires, if swap is enabled).
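
For reference, we applied the limit roughly like this (the 200G value is one of the ones we tried; on luminous the same setting can also be injected at runtime):

    # ceph.conf on the MDS hosts
    [mds]
    mds_cache_memory_limit = 214748364800   # 200G in bytes
    mds_beacon_grace = 240

    # or at runtime, without restarting the daemons
    ceph tell mds.* injectargs '--mds_cache_memory_limit=214748364800'

As noted, the configured value made no visible difference to the replay behavior.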

Before it died, the active MDS reported 1.592 million inodes to Prometheus (ceph_mds_inodes) and 1.493 million caps (ceph_mds_caps).
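
In case it helps anyone reproduce this, those numbers came from queries along these lines (label names will depend on your exporter setup):

    # peak inode and cap counts per MDS over the last day
    max_over_time(ceph_mds_inodes[1d])
    max_over_time(ceph_mds_caps[1d])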


At this point I feel like my best option is to destroy the journal and hope things come back, but while we can probably recover from this, I'd like to prevent it from happening in the future. Any advice?
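
For what it's worth, the journal-removal path I have in mind is the documented cephfs-journal-tool sequence, roughly as follows (assuming a single active rank, and taking a backup first):

    # export the journal before touching anything
    cephfs-journal-tool journal export backup.bin
    # write recoverable events from the journal back into the metadata store
    cephfs-journal-tool event recover_dentries summary
    # then truncate the journal itself
    cephfs-journal-tool journal reset

But again, I'd much rather understand why replay blows up than have to do this regularly.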


Neale Pickett <neale@xxxxxxxx>
A-4: Advanced Research in Cyber Systems
Los Alamos National Laboratory
