On 07/17/2015 02:50 PM, Gregory Farnum wrote:
On Fri, Jul 17, 2015 at 1:13 PM, Kenneth Waegeman
<kenneth.waegeman@xxxxxxxx> wrote:
Hi all,
I've read in the documentation that OSDs use around 512MB on a healthy
cluster (http://ceph.com/docs/master/start/hardware-recommendations/#ram).
Our OSDs, however, are all using around 2GB of RAM while the cluster is
healthy:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
29784 root 20 0 6081276 2.535g 4740 S 0.7 8.1 1346:55 ceph-osd
32818 root 20 0 5417212 2.164g 24780 S 16.2 6.9 1238:55 ceph-osd
25053 root 20 0 5386604 2.159g 27864 S 0.7 6.9 1192:08 ceph-osd
33875 root 20 0 5345288 2.092g 3544 S 0.7 6.7 1188:53 ceph-osd
30779 root 20 0 5474832 2.090g 28892 S 1.0 6.7 1142:29 ceph-osd
22068 root 20 0 5191516 2.000g 28664 S 0.7 6.4 31:56.72 ceph-osd
34932 root 20 0 5242656 1.994g 4536 S 0.3 6.4 1144:48 ceph-osd
26883 root 20 0 5178164 1.938g 6164 S 0.3 6.2 1173:01 ceph-osd
31796 root 20 0 5193308 1.916g 27000 S 16.2 6.1 923:14.87 ceph-osd
25958 root 20 0 5193436 1.901g 2900 S 0.7 6.1 1039:53 ceph-osd
27826 root 20 0 5225764 1.845g 5576 S 1.0 5.9 1031:15 ceph-osd
36011 root 20 0 5111660 1.823g 20512 S 15.9 5.8 1093:01 ceph-osd
19736 root 20 0 2134680 0.994g 0 S 0.3 3.2 46:13.47 ceph-osd
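A quick way to see what that RSS actually is, assuming the OSDs are built
with tcmalloc (the default in the packages), is to ask one of them for its
heap statistics; "osd.12" below is just a placeholder for any OSD id:

# show tcmalloc heap stats: bytes actually in use vs. bytes sitting in freelists
ceph tell osd.12 heap stats

# return freed-but-still-mapped pages to the OS
ceph tell osd.12 heap release

If RSS drops noticeably after the release, a good chunk of the 2GB was just
tcmalloc holding on to freed pages rather than live OSD state.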
[root@osd003 ~]# ceph status
2015-07-17 14:03:13.865063 7f1fde5f0700 -1 WARNING: the following dangerous and experimental features are enabled: keyvaluestore
2015-07-17 14:03:13.887087 7f1fde5f0700 -1 WARNING: the following dangerous and experimental features are enabled: keyvaluestore
    cluster 92bfcf0a-1d39-43b3-b60f-44f01b630e47
     health HEALTH_OK
     monmap e1: 3 mons at {mds01=10.141.16.1:6789/0,mds02=10.141.16.2:6789/0,mds03=10.141.16.3:6789/0}
            election epoch 58, quorum 0,1,2 mds01,mds02,mds03
     mdsmap e17218: 1/1/1 up {0=mds03=up:active}, 1 up:standby
     osdmap e25542: 258 osds: 258 up, 258 in
      pgmap v2460163: 4160 pgs, 4 pools, 228 TB data, 154 Mobjects
            270 TB used, 549 TB / 819 TB avail
                4152 active+clean
                   8 active+clean+scrubbing+deep
We are using erasure code on most of our OSDs, so maybe that is a reason.
But the filestore OSDs backing the cache pool, on 200GB SSDs, are also using 2GB of RAM.
Our erasure-coded pool (16*14 OSDs) has a pg_num of 2048; our cache pool
(2*14 OSDs) has a pg_num of 1024.
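As a rough sanity check: OSD memory scales roughly with the number of PGs
each OSD hosts, and PGs per OSD is about pg_num * pool size / number of OSDs
in the pool. The EC profile isn't stated here, so k=10,m=4 below is only a
hypothetical value, and the cache pool is assumed to be 3x replicated:

# PGs per OSD = pg_num * pool size / OSD count (placeholder pool sizes)
echo $(( 2048 * 14 / 224 ))   # EC pool with a hypothetical k=10,m=4 profile -> 128
echo $(( 1024 * 3 / 28 ))     # cache pool, assuming 3 replicas -> 109

Both land near the commonly cited ~100 PGs per OSD, so the PG counts
themselves don't look unusual.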
Are these normal values for this configuration, and is the documentation a
bit outdated, or should we look into something else?
2GB of RSS is larger than I would have expected, but not unreasonable.
In particular I don't think we've gathered numbers on either EC pools
or on the effects of the caching processes.
-Greg
What data is actually held in memory by the OSDs?
Is this mostly cached data?
We are short on memory on these servers; can we influence this?
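A sketch of two knobs that are often looked at here (the values are only
placeholders, not a recommendation for this particular cluster): the number
of full OSD maps each OSD caches, and asking tcmalloc to hand freed pages
back to the OS:

# shrink the OSD map cache at runtime (200 is just an example value; injectargs
# is not persistent, so also set 'osd map cache size' in ceph.conf to keep it)
ceph tell osd.* injectargs '--osd_map_cache_size 200'

# ask every OSD's tcmalloc to return freed-but-still-mapped memory to the OS
ceph tell osd.* heap release

A smaller map cache trades memory for extra map lookups, so it is worth
watching performance after changing it.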
Thanks again!
Kenneth
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com