Re: OSD memory usage

Just a follow-up here:

I'm chasing down a bug with memory accounting.  On my luminous cluster I
am seeing lots of memory usage that is triggered by scrub.  Pretty sure
this is a bluestore cache mempool issue (making it use more memory than it
thinks it is); hopefully I'll have a fix shortly.
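
If you want to sanity-check what the cache is configured to use, something
like this works on luminous (the option names assume a default bluestore
setup; adjust for your own overrides):

 ceph daemon osd.NNN config get bluestore_cache_size_hdd
 ceph daemon osd.NNN config get bluestore_cache_size_ssd

If the mempool totals stay near that target but RSS is much larger, that is
consistent with the accounting problem above.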

Recovery could trigger the same thing, but this might be something
different...

Either way, if you are seeing this usage, please capture a mempool dump:

 ceph daemon osd.NNN dump_mempools

along with the RSS size for the daemon for reference.  This info will help
me sort out whether it's the same problem or not.
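
For the RSS, something like this is enough (the ps invocation is just one
way to get it; anything that records resident memory for the ceph-osd
process is fine):

 ceph daemon osd.NNN dump_mempools > osd.NNN.mempools
 ps -eo pid,rss,cmd | grep '[c]eph-osd'

Please note which OSD id the RSS number belongs to.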

Thanks!
sage


On Mon, 11 Sep 2017, bulk.schulz@xxxxxxxxxxx wrote:

> Hi Everyone,
> 
> I wonder if someone out there has a similar problem to this?
> 
> I keep having issues with memory usage.  I have 2 OSD servers with 48 GB of memory
> and 12 x 2 TB OSDs.  I seem to have significantly more memory than the minimum
> spec, but these two machines with 2TB drives seem to OOM kill and crash
> periodically -- basically any time the cluster goes into recovery for even 1
> OSD this happens.
> 
> 12 drives * 2 TB = 24 TB.  Using the 1 GB of RAM per 1 TB of disk rule, I should
> need only 24 GB or so.
> 
> I am testing and benchmarking at this time, so most changes are fine.  I am
> abusing this filesystem considerably by running 14 clients, each doing more or
> less a dd to a different file -- but that's the point :)
> 
> When it's working, the performance is really good: 3 GB/s with a 3x replicated
> data pool, up to around 10 GB/s with 1x replication (just for kicks and giggles).
> My bottleneck is likely the SAS channels to those disks.
> 
> I'm using the 12.2.0 release running on CentOS 7.
> 
> Testing CephFS with one MDS and 3 monitors.  The MON/MDS are not on the servers
> in question.
> 
> Total of around 350 OSDs (all spinning disk), most of which are 1 TB drives on
> 15 servers that are a bit older, with Xeon E5620s.
> 
> Dual QDR Infiniband (20GBit) fabrics (1 cluster and 1 client).
> 
> Any thoughts?  Am I missing some tuning parameter in /proc or something?
> 
> Thanks
> -Dave