Have a look at http://docs.ceph.com/docs/master/rados/configuration/mon-osd-interaction/: there are a couple of settings that answer the question "should I consider that OSD down?"

Once an OSD has been down long enough to be marked out (see "mon osd down out interval" on that page), the cluster starts rebalancing to heal itself (basically, missing objects are copied to healthy OSDs).

Then, maybe, the broken OSD will come back to life. Here again the cluster will rebalance: it will recreate the missing objects on that OSD. It will also find some "extra" objects, which will be deleted.

In the end, you will always have a healthy cluster, unless:
- you are running out of space (a near-full cluster that cannot absorb an OSD failure)
- too many OSDs died at the same time, making auto-healing ineffective: you will have "some" objects missing (if all copies were on the missing OSDs, there is no way to recreate them)

On 15/08/2016 11:18, kpeng@xxxxxxxxxx wrote:
> hello,
>
> sorry I am new to ceph.
> Have a question that, we have a cluster of 9 nodes, each with 12 hard
> disks, one osd per disk. if one node gets down, saying 30 minutes,
> during this period all replicas it has will be replicated to other
> OSDes? and, when the node gets started up, how ceph handle the replicas
> again?
>
>
> thanks.
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
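
PS: to put rough numbers on the 9-node, 12-disk case from the question, here is a small back-of-the-envelope sketch in Python. It is not Ceph code and does not talk to a cluster. The function names are made up for illustration, the 600-second down/out interval and the 0.95 full ratio are assumed values (check the page linked above for what your release actually uses), and the space estimate assumes data is spread evenly over all OSDs.

```python
# Back-of-the-envelope model only; nothing here queries a real cluster.
# ASSUMPTION: 600 s stands in for "mon osd down out interval".
# ASSUMPTION: 0.95 stands in for the cluster full ratio.
# ASSUMPTION: data is spread evenly across all OSDs.


def will_rebalance(down_seconds, down_out_interval=600):
    """Return True if the outage lasts long enough for the OSDs to be
    marked out, which is when data starts moving to other OSDs."""
    return down_seconds >= down_out_interval


def utilization_after_failure(used_fraction, total_osds, failed_osds):
    """Estimate how full the surviving OSDs get once they have
    absorbed the data that lived on the failed ones."""
    surviving = total_osds - failed_osds
    if surviving <= 0:
        raise ValueError("no surviving OSDs to absorb the data")
    return used_fraction * total_osds / surviving


def can_absorb_failure(used_fraction, total_osds, failed_osds, full_ratio=0.95):
    """Return True if the survivors stay below the assumed full ratio."""
    return utilization_after_failure(used_fraction, total_osds, failed_osds) < full_ratio


if __name__ == "__main__":
    total = 9 * 12   # 9 nodes, 12 OSDs each
    failed = 12      # one whole node down
    print("rebalance after 30 min down:", will_rebalance(30 * 60))
    for used in (0.50, 0.80, 0.90):
        after = utilization_after_failure(used, total, failed)
        ok = can_absorb_failure(used, total, failed)
        print(f"used {used:.0%} -> about {after:.0%} after failure, ok={ok}")
```

Under those assumptions a 30-minute outage is well past the interval, so the data does get re-replicated. A cluster that is 80% full ends up at roughly 90% on the 96 surviving OSDs, which is the "running out of space" case to watch for.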