Re: Ceph replication factor of 2

If you are so worried about the storage efficiency: why not use erasure coding?
EC performs really well with Luminous in our experience.
Yes, you generate more IOPS, somewhat more CPU load, and higher latency.
But it's often worth a try.

A simple example for everyone considering 2/1 replicas: consider 2/2 erasure coding (k=2, m=2).

* Data durability and availability of 3/2 replicas
* Storage efficiency of 2/1 replicas
* 33% more write IOPS than 3/2 replicas (4 chunk writes per object instead of 3 full copies)
* 100% more read IOPS than any replica setup, since reads hit k=2 OSDs instead of one (400% more to reduce latency with fast_read, which reads all chunks)

Of course, 2/2 erasure coding might seem stupid. We typically use 4/2, 5/2, or 5/3.

So if you are worried about reducing storage overhead: try it out and see for yourself how it performs
for your use case.
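
If you want to try it on Luminous, a 4/2 pool with fast_read only takes a few commands. The profile and
pool names below are placeholders, and the PG count and failure domain of course depend on your cluster:

  ceph osd erasure-code-profile set ec-4-2 k=4 m=2 crush-failure-domain=host
  ceph osd pool create ecpool 128 128 erasure ec-4-2
  ceph osd pool set ecpool fast_read 1
  # only needed if RBD or CephFS data will live on the pool:
  ceph osd pool set ecpool allow_ec_overwrites true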

I've rescued several clusters that were configured with 2/1 replicas and broke down in various ways... it's not pretty,
and it can be annoying and time-consuming to fix. Think tracking down a broken disk where the OSD doesn't start
up properly and trying to get the last copy of a PG off it with ceph-objectstore-tool...
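
If anyone ends up in that situation: the rescue usually boils down to something like the following
(OSD ids, PG id, and paths here are made up, and the OSDs involved must be stopped first):

  systemctl stop ceph-osd@12
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
      --pgid 1.2f --op export --file /tmp/pg.1.2f.export
  # then import the PG into a healthy, stopped OSD and start it again:
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-7 \
      --op import --file /tmp/pg.1.2f.export

Better to never need that in the first place.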



Paul


2018-05-25 9:48 GMT+02:00 Janne Johansson <icepic.dz@xxxxxxxxx>:


On Fri, 25 May 2018 at 00:20, Jack <ceph@xxxxxxxxxxxxxx> wrote:
On 05/24/2018 11:40 PM, Stefan Kooman wrote:
>> What are your thoughts, would you run 2x replication factor in
>> Production and in what scenarios?
Me neither, mostly because I have yet to read a technical point of view
from someone who has read and understands the code.

I do not buy Janne's "trust me, I am an engineer", who btw confirmed
that the "replica 3" stuff is subject to probability and a function of the
cluster size, and thus is not a generic "always-true" rule.

I did not call for trust in _my_ experience or value, but in that of the people who posted the
first "everyone should probably use 3 replicas" advice, about which you expressed doubt.
I agree with them, but did not intend to claim that my post had extra value because
it was written by me.

Also, the last part of my post was very much intended to add that "not everything about 3x is true for everyone",
but if you value your data, it would be very prudent to listen to experienced people who took risks and lost data before.







--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
