On Tue, Jul 28, 2015 at 12:07 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
> On Tue, Jul 28, 2015 at 11:00 AM, Kenneth Waegeman
> <kenneth.waegeman@xxxxxxxx> wrote:
>>
>> On 07/17/2015 02:50 PM, Gregory Farnum wrote:
>>>
>>> On Fri, Jul 17, 2015 at 1:13 PM, Kenneth Waegeman
>>> <kenneth.waegeman@xxxxxxxx> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I've read in the documentation that OSDs use around 512MB on a healthy
>>>> cluster (http://ceph.com/docs/master/start/hardware-recommendations/#ram).
>>>> Now our OSDs are all using around 2GB of RAM while the cluster is
>>>> healthy:
>>>>
>>>>   PID   USER  PR  NI     VIRT     RES    SHR  S  %CPU  %MEM      TIME+  COMMAND
>>>> 29784   root  20   0  6081276  2.535g   4740  S   0.7   8.1    1346:55  ceph-osd
>>>> 32818   root  20   0  5417212  2.164g  24780  S  16.2   6.9    1238:55  ceph-osd
>>>> 25053   root  20   0  5386604  2.159g  27864  S   0.7   6.9    1192:08  ceph-osd
>>>> 33875   root  20   0  5345288  2.092g   3544  S   0.7   6.7    1188:53  ceph-osd
>>>> 30779   root  20   0  5474832  2.090g  28892  S   1.0   6.7    1142:29  ceph-osd
>>>> 22068   root  20   0  5191516  2.000g  28664  S   0.7   6.4   31:56.72  ceph-osd
>>>> 34932   root  20   0  5242656  1.994g   4536  S   0.3   6.4    1144:48  ceph-osd
>>>> 26883   root  20   0  5178164  1.938g   6164  S   0.3   6.2    1173:01  ceph-osd
>>>> 31796   root  20   0  5193308  1.916g  27000  S  16.2   6.1  923:14.87  ceph-osd
>>>> 25958   root  20   0  5193436  1.901g   2900  S   0.7   6.1    1039:53  ceph-osd
>>>> 27826   root  20   0  5225764  1.845g   5576  S   1.0   5.9    1031:15  ceph-osd
>>>> 36011   root  20   0  5111660  1.823g  20512  S  15.9   5.8    1093:01  ceph-osd
>>>> 19736   root  20   0  2134680  0.994g      0  S   0.3   3.2   46:13.47  ceph-osd
>>>>
>>>> [root@osd003 ~]# ceph status
>>>> 2015-07-17 14:03:13.865063 7f1fde5f0700 -1 WARNING: the following dangerous
>>>> and experimental features are enabled: keyvaluestore
>>>> 2015-07-17 14:03:13.887087 7f1fde5f0700 -1 WARNING: the following dangerous
>>>> and experimental features are enabled: keyvaluestore
>>>>     cluster 92bfcf0a-1d39-43b3-b60f-44f01b630e47
>>>>      health HEALTH_OK
>>>>      monmap e1: 3 mons at
>>>> {mds01=10.141.16.1:6789/0,mds02=10.141.16.2:6789/0,mds03=10.141.16.3:6789/0}
>>>>             election epoch 58, quorum 0,1,2 mds01,mds02,mds03
>>>>      mdsmap e17218: 1/1/1 up {0=mds03=up:active}, 1 up:standby
>>>>      osdmap e25542: 258 osds: 258 up, 258 in
>>>>       pgmap v2460163: 4160 pgs, 4 pools, 228 TB data, 154 Mobjects
>>>>             270 TB used, 549 TB / 819 TB avail
>>>>                 4152 active+clean
>>>>                    8 active+clean+scrubbing+deep
>>>>
>>>> We are using erasure coding on most of our OSDs, so maybe that is a reason,
>>>> but the cache-pool filestore OSDs on 200GB SSDs are also using 2GB of RAM.
>>>> Our erasure-coded pool (16*14 OSDs) has a pg_num of 2048; our cache pool
>>>> (2*14 OSDs) has a pg_num of 1024.
>>>>
>>>> Are these normal values for this configuration and is the documentation a
>>>> bit outdated, or should we look into something else?
>>>
>>> 2GB of RSS is larger than I would have expected, but not unreasonable.
>>> In particular I don't think we've gathered numbers on either EC pools
>>> or the effects of the caching processes.
>>
>> Which data is actually in the memory of the OSDs? Is it mostly cached
>> data? We are short on memory on these servers; can we influence this?
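To get a rough first answer to that question on a live daemon, the tcmalloc
heap statistics and the admin socket counters are the quickest things to
check. A minimal sketch, assuming the OSDs are linked against tcmalloc (as
the stock packages usually are) and using osd.12 and the default admin
socket path purely as examples:

# Resident memory of every OSD daemon on this host, largest first
ps -C ceph-osd -o pid,rss,etime,args --sort=-rss | head

# tcmalloc's view: how much of the RSS is memory the daemon is really
# using versus memory the allocator is merely holding on to
ceph tell osd.12 heap stats

# Ask tcmalloc to return freed pages to the kernel; this only helps if
# much of the RSS is allocator slack rather than live data
ceph tell osd.12 heap release

# Internal counters (throttles, queue sizes, ...) via the admin socket
ceph --admin-daemon /var/run/ceph/ceph-osd.12.asok perf dump | less

If heap stats reports a large freelist, a heap release alone can shrink the
RSS noticeably; if the bytes actually in use are already close to 2GB, the
memory is live data rather than allocator slack.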
> Mmm, we've discussed this a few times on the mailing list. The CERN
> guys published a document on experimenting with a very large cluster
> and not enough RAM, but there's nothing I would really recommend
> changing for a production system, especially an EC one, if you aren't
> intimately familiar with what's going on.

In that CERN test the obvious large memory consumer was the osdmap
cache, which was so large because (a) the maps were getting quite big
(a cluster of 7200 OSDs creates a 4MB map, IIRC) and (b) so much osdmap
churn was leading each OSD to cache 500 of the maps. Once the cluster
was fully deployed and healthy, we could restart an OSD and it would
then use only ~300MB, because by then the osdmap cache was nearly empty.

Kenneth: does the memory usage shrink if you restart an OSD? If so, it
could be a similar issue.

Cheers, Dan
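A minimal sketch of the check Dan suggests, plus a quick estimate of how much
the osdmap cache could account for. The osd id, file path, and restart
commands below are examples only; adjust them to your cluster and init
system. Roughly, the map size times osd_map_cache_size (the setting behind
the 500 cached maps Dan mentions) bounds what that cache can hold.

# Size of the current osdmap, and how many epochs each OSD may cache
ceph osd getmap -o /tmp/osdmap && ls -lh /tmp/osdmap
ceph --admin-daemon /var/run/ceph/ceph-osd.12.asok config show | grep osd_map_cache_size

# Resident set of one OSD before the restart (adjust the grep pattern
# to match how your init system launches the daemon)
ps -C ceph-osd -o pid,rss,args | grep -- '-i 12 '

# Restart just that OSD; the exact command depends on the init system
service ceph restart osd.12          # sysvinit packages
# systemctl restart ceph-osd@12      # systemd packages

# Wait for it to rejoin and for the cluster to go HEALTH_OK again, then
# compare the resident set with the value noted above
ceph status
ps -C ceph-osd -o pid,rss,args | grep -- '-i 12 '

If the resident set drops sharply after the restart and only creeps back up
as the map churns, the osdmap cache is the likely consumer, as in the CERN
test; if it barely moves, something else is holding the memory and the heap
stats above are the next place to look.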