Hi Everyone,
I wonder if someone out there has a similar problem to this?
I keep having issues with memory usage. I have two OSD servers, each with 48 GB
of memory and 12 x 2 TB OSDs. I seem to have significantly more memory than
the minimum spec, but these two machines with the 2 TB drives OOM-kill and
crash periodically -- basically any time the cluster goes into recovery for
even one OSD.
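For what it's worth, this is how I've been confirming it's the ceph-osd
processes getting OOM-killed and looking at where their memory goes (osd.0 is
just an example id, and I'm assuming the default admin socket location):

    # Confirm the OOM killer is hitting ceph-osd
    dmesg | grep -i 'killed process'

    # Per-daemon memory pool breakdown via the admin socket (run on the OSD host)
    ceph daemon osd.0 dump_mempools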
12 drives * 2 TB = 24 TB. By the 1 GB of RAM per 1 TB of disk rule, I
should need only about 24 GB per server.
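To see how far past that rule of thumb the daemons actually get, I've just
been watching resident set size like this (nothing fancy):

    # Resident memory (KB) of each ceph-osd on the box, largest first
    ps -C ceph-osd -o pid,rss,cmd --sort=-rss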
I am testing and benchmarking at this time, so most changes are fine. I
am abusing this filesystem considerably by running 14 clients, each doing
something that is more or less dd to its own file -- but that's the point :)
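Each client is running roughly the equivalent of this (illustrative only --
the mount point, block size, and count here are placeholders, not my exact
invocation):

    # One big sequential write per client, each to its own file in cephfs
    dd if=/dev/zero of=/mnt/cephfs/stress-$(hostname -s).dat bs=4M count=100000 oflag=direct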
When it's working, the performance is really good: 3 GB/s with a 3x
replicated data pool, up to around 10 GB/s with 1x replication (just for
kicks and giggles). My bottleneck is likely the SAS channels to those disks.
I'm using the 12.2.0 release running on CentOS 7, testing CephFS with one
MDS and three monitors. The MON/MDS daemons are not on the servers in
question.
There are around 350 OSDs in total (all spinning disk), most of which are
1 TB drives on 15 somewhat older servers with Xeon E5620s.
Dual QDR InfiniBand (20 Gbit) fabrics (one for the cluster network, one for clients).
Any thoughts? Am I missing some tuning parameter in /proc or something?
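In case it matters, this is the sort of thing I was planning to try next in
ceph.conf on those two boxes -- the option names are what I found in the
Luminous docs, and the values are just guesses on my part, not anything I've
validated:

    [osd]
    # shrink the per-OSD BlueStore cache from the (I believe) 1 GiB HDD default
    bluestore cache size hdd = 536870912
    # throttle recovery/backfill so a single failed OSD doesn't blow up memory
    osd max backfills = 1
    osd recovery max active = 1
    osd recovery sleep hdd = 0.1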
Thanks
-Dave