On Tue, Sep 16, 2014 at 5:10 PM, JIten Shah <jshah2005 at me.com> wrote:
> Hi Guys,
>
> We have a cluster with 1000 OSD nodes, 5 MON nodes, and 1 MDS node. In order to be able to lose quite a few OSDs and still survive the load, we were thinking of setting the replication factor to 50.
>
> Is that too big a number? What are the performance implications, and are there any other issues we should consider before setting it to that? Also, do we need the same number of metadata copies, or can it be fewer?

Don't do that. Every write has to be synchronously copied to every replica, so 50x replication will give you very high latencies and very low write bandwidth to each object. If you're just worried about not losing data, there are a lot of people with big clusters running 3x replication and it's been fine.

If you have some use case where you think you're going to be turning off a bunch of nodes simultaneously without planning, Ceph might not be the storage system for your needs.

-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
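
To put rough numbers on the replication cost described above, here is a minimal back-of-the-envelope sketch. It is not a Ceph tool or API. The function name `replication_cost` and the per-OSD figures (4 TB and 100 MB/s per OSD across 1000 OSDs) are hypothetical values chosen only for illustration. The model assumes every client byte is written once per replica and ignores journaling, network limits, and the added latency of waiting for the slowest replica, so real results would likely be worse.

```python
def replication_cost(raw_capacity_tb: float,
                     aggregate_write_mbps: float,
                     replicas: int) -> dict:
    """Estimate usable capacity and client-visible write bandwidth.

    Simplified model: each client byte is stored `replicas` times, so both
    raw capacity and aggregate disk write bandwidth are divided by `replicas`.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return {
        "usable_capacity_tb": raw_capacity_tb / replicas,
        "client_write_mbps": aggregate_write_mbps / replicas,
        "copies_written_per_client_byte": replicas,
    }


if __name__ == "__main__":
    # Hypothetical cluster: 1000 OSDs x 4 TB and 1000 OSDs x 100 MB/s.
    raw_tb = 1000 * 4.0
    raw_mbps = 1000 * 100.0
    for size in (3, 50):
        print(size, replication_cost(raw_tb, raw_mbps, size))
```

Under these assumptions, moving from 3 to 50 replicas cuts both usable capacity and aggregate client write bandwidth by a factor of roughly 17, before accounting for the latency of acknowledging every copy.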