Re: OSD RAM usage values

On 07/28/2015 04:04 PM, Dan van der Ster wrote:
On Tue, Jul 28, 2015 at 12:07 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
On Tue, Jul 28, 2015 at 11:00 AM, Kenneth Waegeman
<kenneth.waegeman@xxxxxxxx> wrote:


On 07/17/2015 02:50 PM, Gregory Farnum wrote:

On Fri, Jul 17, 2015 at 1:13 PM, Kenneth Waegeman
<kenneth.waegeman@xxxxxxxx> wrote:

Hi all,

I've read in the documentation that OSDs use around 512MB on a healthy
cluster (http://ceph.com/docs/master/start/hardware-recommendations/#ram).
However, our OSDs are all using around 2GB of RAM while the cluster is
healthy.


   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 29784 root      20   0 6081276 2.535g   4740 S   0.7  8.1   1346:55 ceph-osd
 32818 root      20   0 5417212 2.164g  24780 S  16.2  6.9   1238:55 ceph-osd
 25053 root      20   0 5386604 2.159g  27864 S   0.7  6.9   1192:08 ceph-osd
 33875 root      20   0 5345288 2.092g   3544 S   0.7  6.7   1188:53 ceph-osd
 30779 root      20   0 5474832 2.090g  28892 S   1.0  6.7   1142:29 ceph-osd
 22068 root      20   0 5191516 2.000g  28664 S   0.7  6.4  31:56.72 ceph-osd
 34932 root      20   0 5242656 1.994g   4536 S   0.3  6.4   1144:48 ceph-osd
 26883 root      20   0 5178164 1.938g   6164 S   0.3  6.2   1173:01 ceph-osd
 31796 root      20   0 5193308 1.916g  27000 S  16.2  6.1 923:14.87 ceph-osd
 25958 root      20   0 5193436 1.901g   2900 S   0.7  6.1   1039:53 ceph-osd
 27826 root      20   0 5225764 1.845g   5576 S   1.0  5.9   1031:15 ceph-osd
 36011 root      20   0 5111660 1.823g  20512 S  15.9  5.8   1093:01 ceph-osd
 19736 root      20   0 2134680 0.994g      0 S   0.3  3.2  46:13.47 ceph-osd
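
For reference, top does not show which OSD id each PID belongs to, but the
daemon's command line usually does (the "-i <id>" argument). A minimal
sketch to list per-daemon resident memory, assuming a procps-style ps:

# One line per ceph-osd process: PID, resident memory (RSS, in KiB) and the
# full command line, sorted largest first; "-i <id>" identifies the OSD.
ps -C ceph-osd -o pid,rss,args --sort=-rss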



[root@osd003 ~]# ceph status
2015-07-17 14:03:13.865063 7f1fde5f0700 -1 WARNING: the following dangerous and experimental features are enabled: keyvaluestore
2015-07-17 14:03:13.887087 7f1fde5f0700 -1 WARNING: the following dangerous and experimental features are enabled: keyvaluestore
      cluster 92bfcf0a-1d39-43b3-b60f-44f01b630e47
       health HEALTH_OK
       monmap e1: 3 mons at {mds01=10.141.16.1:6789/0,mds02=10.141.16.2:6789/0,mds03=10.141.16.3:6789/0}
              election epoch 58, quorum 0,1,2 mds01,mds02,mds03
       mdsmap e17218: 1/1/1 up {0=mds03=up:active}, 1 up:standby
       osdmap e25542: 258 osds: 258 up, 258 in
        pgmap v2460163: 4160 pgs, 4 pools, 228 TB data, 154 Mobjects
              270 TB used, 549 TB / 819 TB avail
                  4152 active+clean
                     8 active+clean+scrubbing+deep


We are using erasure code on most of our OSDs, so maybe that is a reason,
but the cache-pool filestore OSDs on 200GB SSDs are also using 2GB of RAM.
Our erasure-coded pool (16*14 OSDs) has a pg_num of 2048; our cache pool
(2*14 OSDs) has a pg_num of 1024.
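
For a rough sense of scale (memory usage grows with the number of PGs each
OSD carries, among other things), the back-of-the-envelope arithmetic below
may help. The EC profile (k+m=14) and the cache-pool replica count (3) are
assumptions for illustration only, not values taken from this thread:

# PG shards/copies per OSD, shell arithmetic with assumed pool parameters.
echo $(( 2048 * 14 / 224 ))   # EC pool: pg_num * (k+m) / OSDs  -> ~128 per OSD
echo $(( 1024 * 3 / 28 ))     # cache pool: pg_num * size / OSDs -> ~109 per OSD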

Are these normal values for this configuration, and is the documentation a
bit outdated, or should we look into something else?


2GB of RSS is larger than I would have expected, but not unreasonable.
In particular I don't think we've gathered numbers on either EC pools
or on the effects of the caching processes.


Which data is actually held in memory by the OSDs?
Is this mostly cached data?
We are short on memory on these servers; is there anything we can do to
influence this?
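
One way to get a first look at where the memory is going is the tcmalloc
heap statistics exposed through the admin interface; a minimal sketch
(osd.12 is just an example id):

# Print tcmalloc heap statistics for one OSD.
ceph tell osd.12 heap stats
# Memory that tcmalloc has freed but not yet returned to the OS can be
# handed back with:
ceph tell osd.12 heap release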

Mmm, we've discussed this a few times on the mailing list. The CERN
guys published a document on experimenting with a very large cluster
and not enough RAM, but there's nothing I would really recommend
changing for a production system, especially an EC one, if you aren't
intimately familiar with what's going on.

In that CERN test the obvious large memory consumer was the osdmap
cache, which was so large because (a) the maps were getting quite
large (7200 OSDs creates a 4MB map, IIRC) and (b) so much osdmap churn
was leading each OSD to cache 500 of the maps. Once the cluster was
fully deployed and healthy, we could restart an OSD and it would then
only use ~300MB (because now the osdmap cache was ~empty).
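
A quick way to check whether the osdmap cache could plausibly account for
that much memory is to look at the size of the current map and at how many
maps an OSD may cache; a sketch, with osd.12 again just an example id
(ceph daemon has to run on the host that holds that OSD's admin socket):

# Dump the current osdmap to a file and check its size.
ceph osd getmap -o /tmp/osdmap
ls -lh /tmp/osdmap
# The number of maps an OSD keeps cached (the default is 500).
ceph daemon osd.12 config get osd_map_cache_size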

Kenneth: does the memory usage shrink if you restart an osd? If so, it
could be a similar issue.
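
A minimal way to test that (a sketch; the systemd unit name below is an
assumption, and on a sysvinit deployment it would be something like
"service ceph restart osd.12"):

# Note the OSD's RSS (see the ps/top output above), restart it, then compare
# again once it has rejoined and the cluster is back to HEALTH_OK.
systemctl restart ceph-osd@12
ps -C ceph-osd -o pid,rss,args --sort=-rss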

Thanks!
I tried restarting some OSDs when the cluster was healthy. Sometimes an OSD
grows right back to the memory level it had before; on other tries it
settles at about 1GB of memory, so roughly half. We do not see it going
below that level, but maybe that is because of EC.

Kenneth

Cheers, Dan

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


