Replication factor of 50 on a 1000 OSD node cluster

On Tue, Sep 16, 2014 at 5:10 PM, JIten Shah <jshah2005 at me.com> wrote:
> Hi Guys,
>
> We have a cluster with 1000 OSD nodes, 5 MON nodes and 1 MDS node. To be able to lose quite a few OSDs and still survive the load, we were thinking of setting the replication factor to 50.
>
> Is that too big a number? What are the performance implications, and are there any other issues we should consider before setting it to that? Also, do we need the same number of metadata copies, or can it be fewer?

Don't do that. Every write has to be synchronously copied to every
replica, so 50x replication will give you very high latencies and very
low write bandwidth to each object. If you're just worried about not
losing data, there are a lot of people with big clusters running 3x
replication and it's been fine.
If you have some use case where you think you're going to be turning
off a bunch of nodes simultaneously without planning, Ceph might not
be the storage system for your needs.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
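
As a rough illustration of the point about synchronous replication, here is a back-of-envelope sketch in Python comparing 3x and 50x. It is not a Ceph API or a benchmark: the function name and the input figures (1000 TB of raw capacity, 100 MB/s of client writes) are made up for the example, and it only counts copies. It ignores journaling, network topology and recovery traffic.

```python
def replication_cost(raw_capacity_tb: float, client_write_mb_s: float, size: int) -> dict:
    """Back-of-envelope cost of keeping `size` full copies of every object.

    All inputs are hypothetical; this counts copies only and is not a
    model of real Ceph performance.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    return {
        # Every object is stored `size` times, so usable space shrinks by that factor.
        "usable_capacity_tb": raw_capacity_tb / size,
        # Each client byte is written once per replica somewhere in the cluster.
        "backend_write_mb_s": client_write_mb_s * size,
        # A write is acknowledged only after every replica has it,
        # so the slowest of these copies sets the latency.
        "copies_committed_per_write": size,
    }


if __name__ == "__main__":
    # Assumed figures for illustration: 1000 TB raw, 100 MB/s of client writes.
    for size in (3, 50):
        print(size, replication_cost(1000.0, 100.0, size))
```

With these assumed inputs, 50x replication leaves one fiftieth of the raw capacity usable and makes every client write wait on 50 copies instead of 3, which is where the latency and bandwidth penalty comes from.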

