Hi @all,

On 02/08/2017 08:45 PM, Jim Kilborn wrote:

> I have had two ceph monitor nodes generate swap space alerts this week.
> Looking at the memory, I see ceph-mon using a lot of memory and most of
> the swap space. My ceph nodes have 128GB mem, with 2GB swap (I know the
> memory/swap ratio is odd)

I had exactly the same problem here in my little ceph cluster:

- 5 nodes ceph01,02,03,04,05 on Ubuntu Trusty with kernel 3.13 (the kernel from the distribution)
- Ceph version Jewel 10.2.9
- 4 OSDs per node
- 3 monitors on ceph01,02,03
- 1 active and 2 standby mds on ceph01,02,03

Yesterday, on _ceph02_, I had:

1. Swap and RAM at 100%.
2. A kswapd0 process taking 100% of 1 CPU.
3. A simple "ceph status" or "ceph --version" on this node (ceph02) failing
   with "ImportError: librados.so.2 cannot map zero-fill pages: Cannot
   allocate memory".

However, on the other nodes, a "ceph status" gave me a fully HEALTH_OK cluster.

Maybe an important point: ceph01, ceph02 and ceph03 are identical servers (same hardware and same configuration via Puppet, 4 OSDs + 1 mon + 1 mds each), but the _active_ mds had been hosted on ceph02 for approximately 2 months.

The ceph-mon process on ceph02 was OOM-killed by the kernel last night, and the memory usage is back to normal now. The data in the monitor working directories are really small, as you can see:

              Filesystem  Size  Used  Avail  Use%  Mounted on
    ceph01 => /dev/sda5    30G  126M    30G    1%  /var/lib/ceph/mon/ceph-ceph01
    ceph02 => /dev/sda5    30G  121M    30G    1%  /var/lib/ceph/mon/ceph-ceph02
    ceph03 => /dev/sda5    30G   78M    30G    1%  /var/lib/ceph/mon/ceph-ceph03

It seems to me that the problem builds up gradually over approximately 2 months; it does not appear suddenly.

Is this a known issue?

Thanks for your help.

--
François Lafont
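
PS: For anyone who wants to check the same thing on their own monitor nodes, something like the following standard Linux commands should show the memory/swap usage of ceph-mon and the trace of the OOM kill. This is just a rough sketch; the kern.log path below is the Ubuntu default and may differ on another distribution:

    # overall RAM and swap usage of the node
    free -m

    # resident memory and swap used by the ceph-mon process
    ps -o pid,rss,vsz,comm -C ceph-mon
    grep VmSwap /proc/$(pidof ceph-mon)/status

    # trace of the OOM kill in the kernel log
    dmesg | grep -i -E 'out of memory|killed process'
    grep -i -E 'out of memory|killed process' /var/log/kern.log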