We've had this issue for a while. We just monitor memory usage and restart the mon services when one or more of them reaches 80%.
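For what it's worth, the watchdog can be as simple as a cron job. A minimal sketch of the approach, assuming the mon id ("c"), the 80% threshold, and the upstart `restart` syntax used on trusty -- adapt all three to your hosts:

```shell
#!/bin/sh
# Restart a ceph-mon when its resident memory crosses a threshold
# percentage of total system memory. Run from cron every few minutes.
# MON_ID, THRESHOLD, and the restart command are illustrative.

MON_ID="${1:-c}"
THRESHOLD=80

# mem_pct RSS_KB TOTAL_KB -> integer percent of total memory used
mem_pct() {
    echo $(( $1 * 100 / $2 ))
}

pid=$(pgrep -f "ceph-mon -i ${MON_ID}" | head -n1)
if [ -n "$pid" ]; then
    total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    rss_kb=$(awk '/^VmRSS:/ {print $2}' "/proc/${pid}/status")
    if [ "$(mem_pct "$rss_kb" "$total_kb")" -ge "$THRESHOLD" ]; then
        restart ceph-mon id="${MON_ID}"   # upstart syntax on trusty
    fi
fi
```

On systemd hosts the restart line would be `systemctl restart ceph-mon@${MON_ID}` instead. Reading VmRSS from /proc avoids parsing `ps` output, but the numbers are the same ones you see in the ps listings below.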
Sent from my iPhone
> On Nov 18, 2016, at 3:35 AM, Corin Langosch <corin.langosch@xxxxxxxxxxx> wrote:
>
> Hi,
>
> about 2 weeks ago I upgraded a rather small cluster from ceph 0.94.2 to 0.94.9. The upgrade went fine, the cluster is running stable. But I just noticed that one monitor is already eating 20 GB of memory, growing slowly over time. The other 2 mons look fine. The disk space used by the problematic mon looks fine too.
>
> ceph -w
> cluster 4ac0e21b-6ea2-4ac7-8114-122bd9ba55d6
> health HEALTH_OK
> monmap e5: 3 mons at {a=10.0.0.5:6789/0,b=10.0.0.6:6789/0,c=10.0.0.7:6789/0}
> election epoch 856, quorum 0,1,2 a,b,c
> osdmap e27838: 19 osds: 9 up, 9 in
> pgmap v104865438: 4096 pgs, 1 pools, 3225 GB data, 809 kobjects
> 6463 GB used, 1955 GB / 8419 GB avail
> 4096 active+clean
> client io 1484 kB/s rd, 14217 kB/s wr, 2026 op/s
>
>
> ps aux | grep ceph
> root 5958 1.2 15.7 21194296 20821228 ? Sl Nov05 238:33 /usr/bin/ceph-mon -i c --pid-file /var/run/ceph/mon.c.pid -c /etc/ceph/ceph.conf --cluster ceph
> root 7457 38.5 0.3 1660028 509008 ? Ssl Nov05 7125:00 /usr/bin/ceph-osd -i 18 --pid-file /var/run/ceph/osd.18.pid -c /etc/ceph/ceph.conf --cluster ceph
> root 7981 25.8 0.4 1661064 543684 ? Ssl Nov05 4775:44 /usr/bin/ceph-osd -i 9 --pid-file /var/run/ceph/osd.9.pid -c /etc/ceph/ceph.conf --cluster ceph
>
>
> ps aux | grep ceph
> root 12704 2.0 0.1 532184 93468 ? Sl Nov05 374:16 /usr/bin/ceph-mon -i a --pid-file /var/run/ceph/mon.a.pid -c /etc/ceph/ceph.conf --cluster ceph
> root 14587 32.3 0.8 1682720 581752 ? Ssl Nov05 5970:53 /usr/bin/ceph-osd -i 16 --pid-file /var/run/ceph/osd.16.pid -c /etc/ceph/ceph.conf --cluster ceph
> root 14919 28.8 0.7 1680144 526052 ? Ssl Nov05 5328:17 /usr/bin/ceph-osd -i 3 --pid-file /var/run/ceph/osd.3.pid -c /etc/ceph/ceph.conf --cluster ceph
>
>
> ps aux | grep ceph
> root 19114 1.3 0.2 508000 157724 ? Sl Nov05 256:53 /usr/bin/ceph-mon -i b --pid-file /var/run/ceph/mon.b.pid -c /etc/ceph/ceph.conf --cluster ceph
> root 20155 34.3 0.7 1652784 516072 ? Ssl Nov05 6341:37 /usr/bin/ceph-osd -i 17 --pid-file /var/run/ceph/osd.17.pid -c /etc/ceph/ceph.conf --cluster ceph
> root 20508 33.2 0.7 1666276 510496 ? Ssl Nov05 6135:04 /usr/bin/ceph-osd -i 6 --pid-file /var/run/ceph/osd.6.pid -c /etc/ceph/ceph.conf --cluster ceph
>
>
> dpkg -l | grep ceph
> ii ceph 0.94.9-1trusty amd64 distributed storage and file system
> ii ceph-common 0.94.9-1trusty amd64 common utilities to mount and interact with a ceph storage cluster
> ii ceph-fs-common 0.94.9-1trusty amd64 common utilities to mount and interact with a ceph file system
> ii ceph-fuse 0.94.9-1trusty amd64 FUSE-based client for the Ceph distributed file system
> ii ceph-mds 0.94.9-1trusty amd64 metadata server for the ceph distributed file system
> ii ceph-test 0.94.9-1trusty amd64 Ceph test and benchmarking tools
> ii libcephfs1 0.94.9-1trusty amd64 Ceph distributed file system client library
> ii python-cephfs 0.94.9-1trusty amd64 Python libraries for the Ceph libcephfs library
>
> /var/lib/ceph/mon/ceph-c# du -h
> 108K ./auth
> 88K ./auth_gv
> 2.1M ./logm
> 2.0M ./logm_gv
> 24K ./mdsmap
> 4.0K ./mdsmap_gv
> 4.0K ./mkfs
> 40K ./monmap
> 20K ./monmap_gv
> 4.2M ./osdmap
> 15M ./osdmap_full
> 4.2M ./osdmap_gv
> 13M ./pgmap
> 2.0M ./pgmap_gv
> 201M ./store.db
> 243M .
>
> Is this a known bug? Any workaround or fix available?
>
> Thanks
> Corin
>
David Turner | Cloud Operations Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2760 | Mobile: 385.224.2943

If you are not the intended recipient of this message or received it erroneously, please notify the sender and delete it, together with any attachments, and be advised that any dissemination or copying of this message is prohibited.
_______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com