Re: Ceph replication factor of 2

Hi,

About BlueStore: sure, there are checksums, but are they fully used?
Rumour has it that on a replicated pool they are not used during recovery.
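
For illustration, here is a minimal conceptual sketch (plain Python, not
Ceph code, and making no claim about what BlueStore actually does during
recovery). It only shows the idea mentioned below: with a checksum stored
at write time, a single replica can be verified on its own, whereas
without one you need at least 3 copies to vote on which data is good.

    import zlib

    # Conceptual sketch only, not Ceph internals.

    def verify_with_checksum(data: bytes, stored_crc: int) -> bool:
        """With a checksum stored at write time, one copy is enough to
        tell whether this particular replica is intact."""
        return zlib.crc32(data) == stored_crc

    def pick_majority(replicas: list[bytes]) -> bytes:
        """Without checksums, the best you can do is compare copies and
        trust the value held by the majority -- which needs size >= 3."""
        counts: dict[bytes, int] = {}
        for r in replicas:
            counts[r] = counts.get(r, 0) + 1
        return max(counts, key=counts.get)

    good = b"object payload"
    crc = zlib.crc32(good)
    corrupt = b"object pAyload"

    print(verify_with_checksum(corrupt, crc))            # False: caught with 1 copy
    print(pick_majority([good, corrupt, good]) == good)  # True: needs 3 copies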


> My thoughts on the subject are that even though checksums do allow finding which replica is corrupt without having to figure out which 2 out of 3 copies are the same, this is not the only reason min_size=2 was required. Even if you are running all SSDs, which are more reliable than HDDs, and are keeping the disk size small so you could backfill quickly in case of a single disk failure, you would still occasionally have longer periods of degraded operation. To name a couple: a full node going down, or an operator deliberately wiping an OSD to rebuild it. min_size=1 in this case would leave you running with no redundancy at all. A DR scenario with pool-to-pool mirroring probably means that you cannot just replace the lost or incomplete PGs in your main site from your DR, because the DR is likely to have a different PG layout, so a full resync from the DR would be required if one disk were lost during such unprotected times.

I have to say, this is a common yet worthless argument.
If I have 3000 OSDs, using 2 or 3 replicas will not change much: the
probability of losing 2 devices at the same time is still "high".

On the other hand, if I have a small cluster with fewer than a hundred
OSDs, that same probability becomes "low".
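
To make that concrete, here is a back-of-envelope sketch in Python. The
failure rate and recovery window below are hypothetical assumptions of
mine, not figures from this thread; the only point is how the chance of
overlapping failures scales with the OSD count.

    # Back-of-envelope estimate of a second disk failing while the cluster
    # is still recovering from a first failure. AFR and RECOVERY_H are
    # assumed values for illustration only.

    AFR = 0.02          # assumed annual failure rate per device (2 %)
    RECOVERY_H = 8.0    # assumed recovery window after a failure, in hours
    HOURS_PER_YEAR = 24 * 365

    def p_fail_within(hours: float) -> float:
        """Probability that a single device fails within the given window."""
        return AFR * hours / HOURS_PER_YEAR

    def double_failure_events_per_year(n_osds: int) -> float:
        """Expected number of events per year where another device fails
        during the recovery window that follows a first failure."""
        first_failures_per_year = n_osds * AFR
        p_second = 1 - (1 - p_fail_within(RECOVERY_H)) ** (n_osds - 1)
        return first_failures_per_year * p_second

    for n in (100, 3000):
        print(f"{n:5d} OSDs: ~{double_failure_events_per_year(n):.4f} "
              f"overlapping double-failure events per year")

With these assumed numbers the small cluster sees a few thousandths of
such an event per year, while the 3000-OSD cluster sees several per year.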

I do not buy the "if someone is doing maintenance and a device fails"
argument either: this is a goal with no limit. What if X servers burn at
the same time? What if an admin makes a mistake and drops 5 OSDs? What if
some ToR switches or routers blow up?
Should we keep one replica per OSD?


Thus, I would like to separate the technical sanity of using 2 replicas
from the organisational sanity of doing so.

Organisational matters are specific to each operator; the technical ones
are shared by all clusters.

I would like people, especially the Ceph devs and others who know deeply
how it works (who have read the code!), to give us their advice.

Regards,
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


