Hi, I'm also new, but I'll try to help. IMHO most of the pros here would
be quite worried about this cluster if it is production:

-A prod ceph cluster should not be run with size=2 min_size=1, because:
--In case of a down'ed osd / host the cluster can have problems
  determining which copy of the data is correct when the osd / host
  comes back up
--If an osd dies, the remaining osds get more io (they have to
  compensate for the lost io capacity plus the rebuilding), which can
  instantly kill another disk that is close to death (not with ceph,
  but with raid I have been there)
--If an osd dies and ANY other osd serving that pool has a well placed
  inconsistency, like bitrot, you'll lose data
  (you can check what your pools are set to right now, see the commands
  sketched below)
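If you want to double-check the current replication settings, something
like this should show them (untested here; the pool name "rbd" is only
an example, use your own pool names):

  ceph osd pool ls detail         # lists size / min_size for every pool
  ceph osd pool get rbd size      # replica count of one pool
  ceph osd pool get rbd min_size  # copies required to keep serving io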
-There are not enough hosts in your setup, or rather the disks are not
 distributed well:
--If an osd / host dies, the cluster tries to repair itself and relocate
  the data onto another host. In your config there is no other host to
  relocate the data to if ANY of the hosts fail (I guess that hdds and
  ssds are separated); see the crush checks sketched below.
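To see how the osds are spread over the hosts and what failure domain
the crush rules actually use, something like this should do (these are
stock commands, nothing exotic):

  ceph osd tree             # osd -> host layout
  ceph osd crush rule dump  # the chooseleaf step should say "type host"
  ceph osd pool ls detail   # shows which crush_rule each pool uses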
-The disks should not be placed in raid arrays if it can be avoided,
 especially raid0:
--You multiply the probability of an unrecoverable disk error, and since
  the data is striped, the other disks' data is lost too
--When an osd dies, the cluster should relocate its data onto another
  osd. With raid0 there is now double the data that needs to be moved,
  which causes 2 problems: recovery time / io, and free space. The
  cluster should have enough free space to relocate data to; in this
  setup you cannot do that if a host dies (see above), but if an osd
  dies, ceph would try to replicate the data onto the other osds in the
  same machine. So you have to have enough free space on >>the same
  host<< in this setup to replicate data to (see the free-space checks
  sketched below).
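To get an idea of how much headroom you actually have, something like
this (the ratio names below are the luminous defaults, older releases
keep them in the config instead of the osdmap):

  ceph osd df   # per-osd utilization and variance
  ceph df       # per-pool usage
  # recovery stalls once osds hit these thresholds:
  ceph osd dump | grep -E 'full_ratio|nearfull_ratio|backfillfull_ratio'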
In your case, I would recommend:
-Introducing (and activating) a fourth osd host
-Setting size=3 min_size=2
-After data migration is done, separating the raid0 arrays one by one:
 (remove, split) -> (zap, init, add) for each array, in such a manner
 that hdds and ssds end up evenly distributed across the servers
-Always keeping enough free space that the cluster can lose a host and
 still have room to repair itself (calculate with the full / nearfull
 ratio settings)

Roughly like the commands sketched below.
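Very roughly, and untested -- the pool name "rbd", osd id 12 and device
/dev/sdb are only placeholders for your own values, and the
purge / ceph-volume commands assume luminous:

  # raise the replication level per pool
  ceph osd pool set rbd size 3
  ceph osd pool set rbd min_size 2

  # then, for each raid0-backed osd, one at a time, and only while the
  # cluster is HEALTH_OK:
  ceph osd out 12
  # ...wait for the rebalance to finish (watch ceph -s), then:
  ceph osd purge 12 --yes-i-really-mean-it
  # break up the raid0 array in the controller, then re-add the members
  # as individual osds:
  ceph-volume lvm zap /dev/sdb --destroy
  ceph-volume lvm create --data /dev/sdb

Repeat until every disk is its own osd, and keep an eye on ceph osd df
while doing it so you never run into the full ratio.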
I hope this helps, and please keep in mind that I'm a noob too :)

Denes.