The largest cluster for now?

han vincent <hangzws@xxxxxxxxx> · Thu, 10 Nov 2016 19:17:35 +0800



Hello, all:
    I am planning to build a large-scale Ceph cluster in production
for OpenStack, and I want to make the cluster as large as possible.
    In an earlier thread on this mailing list, Karol asked about the
"largest ceph cluster":
        http://lists.ceph.com/pipermail/ceph-users-ceph.com/2014-April/028371.html
    In that thread, DreamHost and CERN both said they had built a
3-PB cluster.

    Over the last few days, I have read CERN's report "Ceph ~30PB Test Report":
         https://cds.cern.ch/record/2015206/files/CephScaleTestMarch2015.pdf
    In order to build such a large cluster, the CERN team made some
changes:
    1. Set the noin and noup flags before the OSDs are activated, to
keep the osdmap from changing too frequently.
    2. Apply the following configuration, which reduces the memory
consumption of the OSD and monitor daemons from ~2GB to ~500MB:

    [global]
      osd map message max=10
    [osd]
      osd map cache size=20
      osd map max advance=10
      osd map share max epochs=10
      osd pg epoch persisted max stale=10

    3. Add SSDs to the monitors, because the monitors were overloaded
by too many OSD creation transactions.
    4. Upgrade Ceph to Hammer to keep leveldb from growing rapidly.
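
    For reference, this is how I understand step 1 in terms of the ceph
CLI. It is only a sketch and not CERN's actual procedure: the CEPH
variable and the two function names are my own placeholders, and by
default the script is a dry run that just prints the commands.

```shell
#!/bin/sh
# Dry run by default: CEPH is a placeholder that only echoes the commands.
# Run with CEPH=ceph to execute them against a real cluster.
CEPH="${CEPH:-echo ceph}"

# Set the flags so that newly started OSDs do not change the osdmap yet.
hold_new_osds() {
    $CEPH osd set noup   # booting OSDs are not marked "up"
    $CEPH osd set noin   # booting OSDs are not marked "in"
}

# Clear the flags once all new OSDs have been created and started.
release_new_osds() {
    $CEPH osd unset noup
    $CEPH osd unset noin
}

hold_new_osds
# ... create and start the new OSDs here ...
release_new_osds
```

    Whether both flags should be cleared at once, or noin kept and the
OSDs marked in gradually, is something I am not sure about.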
    It seems that CERN's 30-PB cluster was for testing only and is not
yet in a production environment?
    I would like to know what cluster size is currently the best fit
for a production environment: 3 PB? 30 PB? Or bigger?
    And once a large-scale cluster has been built, how should it be
maintained over time?
    What are the core issues with a large cluster, and what can we do to
avoid the potential problems?
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com