very high OSD RAM usage values

Hi all,

We experienced some serious trouble with our cluster: a running cluster started failing and set off a chain reaction until the Ceph cluster was down, with about half of the OSDs down (in an EC pool).

Each host has 8 OSDs of 8 TB (i.e. a RAID 0 of two 4 TB disks) for an EC pool (10+3, 14 hosts), plus 2 cache OSDs and 32 GB of RAM. The reason we use a RAID 0 of the disks is that we tried with 16 disks per host before, but 32 GB didn't seem enough to keep the cluster stable.
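As a rough back-of-the-envelope, assuming the commonly cited guideline of about 1 GB of RAM per 1 TB of OSD storage (a rule of thumb only, and recovery on EC pools tends to need more), the hosts look undersized:

# Back-of-the-envelope sizing check. The "~1 GB RAM per 1 TB of OSD storage"
# figure is only the usual rule of thumb, not a guarantee.
osds_per_host = 8
tb_per_osd = 8                 # each OSD is a RAID 0 of two 4 TB disks
ram_per_host_gb = 32

raw_tb_per_host = osds_per_host * tb_per_osd              # 64 TB raw per host
guideline_ram_gb = raw_tb_per_host                        # ~64 GB suggested by the rule of thumb
ram_per_osd_gb = ram_per_host_gb / float(osds_per_host)   # 4 GB actually available per OSD

print("raw capacity per host: %d TB" % raw_tb_per_host)
print("RAM suggested by the guideline: ~%d GB, installed: %d GB"
      % (guideline_ram_gb, ram_per_host_gb))
print("RAM available per OSD: %.1f GB (observed peaks: ~8 GB resident)" % ram_per_osd_gb)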

We don't know for sure what triggered the chain reaction, but what we certainly see is that while recovering, our OSDs use a lot of memory. We've seen some OSDs using almost 8 GB of RAM (resident; 11 GB virtual). So right now we don't have enough memory to recover the cluster, because the OSDs get killed by the OOM killer before they can finish recovering.
And I don't know whether doubling our memory will be enough.
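For reference, a minimal sketch like the one below (it just reads VmRSS from /proc and assumes the daemon processes are named "ceph-osd") shows which OSDs are closest to being killed during recovery:

#!/usr/bin/env python
# Report resident memory (VmRSS) of every running ceph-osd process,
# largest first, by reading /proc on the OSD host.
import os

def osd_rss_kb():
    for pid in filter(str.isdigit, os.listdir('/proc')):
        try:
            with open('/proc/%s/comm' % pid) as f:
                if f.read().strip() != 'ceph-osd':
                    continue
            with open('/proc/%s/status' % pid) as f:
                for line in f:
                    if line.startswith('VmRSS:'):
                        yield pid, int(line.split()[1])  # value is in kB
        except (IOError, OSError):
            continue  # process exited while we were reading it

for pid, rss in sorted(osd_rss_kb(), key=lambda item: -item[1]):
    print('pid %s  ceph-osd  RSS %.2f GB' % (pid, rss / 1048576.0))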

A few questions:

* Has anyone seen this before?
* 2 GB was still normal, but 8 GB seems like a lot; is this expected behaviour?
* We didn't see this with a nearly empty cluster. Now it was about 1/4 full (270 TB). I guess it would get worse when half full or more?
* How high can this memory usage get? Can we calculate the maximum memory an OSD can use? Can we limit it (see the throttling sketch after this list)?
* We can upgrade/reinstall to Infernalis; will that solve anything?
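On the "can we limit it" question: I don't think the daemon itself offers a hard per-OSD memory cap on these releases, but throttling backfill/recovery concurrency is the advice that usually comes up for reducing recovery-time memory pressure. A minimal sketch of applying that at runtime (the option names osd_max_backfills and osd_recovery_max_active should be double-checked against the running release):

import subprocess

def throttle_recovery(max_backfills=1, max_recovery_active=1):
    # Lower the number of PGs an OSD backfills/recovers concurrently.
    # This is not a memory cap, just a way to reduce peak pressure.
    args = '--osd-max-backfills %d --osd-recovery-max-active %d' % (
        max_backfills, max_recovery_active)
    subprocess.check_call(['ceph', 'tell', 'osd.*', 'injectargs', args])

if __name__ == '__main__':
    throttle_recovery()

The same values could also be persisted under [osd] in ceph.conf so they survive restarts.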

This is related to a previous post of mine: http://permalink.gmane.org/gmane.comp.file-systems.ceph.user/22259


Thank you very much!!

Kenneth
