On 10/24/2015 09:41 AM, Stefan Eriksson wrote:
>> On 23.10.2015 at 20:53, Gregory Farnum wrote:
>>> On Fri, Oct 23, 2015 at 8:17 AM, Stefan Eriksson <stefan@xxxxxxxxxxx> wrote:
>>>
>>> Nothing changed to make two copies less secure. 3 copies is just so
>>> much more secure and is the number that all the companies providing
>>> support recommend, so we changed the default.
>>> (If you're using it for data you care about, you should really use 3 copies!)
>>> -Greg
>>
>> I assume that number really depends on the (number of) OSDs you have in
>> your crush rule for that pool. A replication of 2 might be OK for a pool
>> spread over 10 OSDs, but not for one spread over 100 OSDs...
>>
>> Corin
>
> I'm also interested in this: what changes when you add 100+ OSDs (to
> warrant 3 replicas instead of 2), and what is the reasoning behind "the
> companies providing support recommend 3"?
> Theoretically it seems secure to have two replicas.
> If you have 100+ OSDs, I can see that maintenance will take much longer,
> and if you use "set noout" then a PG will be active on only a single
> replica while the other replica is under maintenance.
> But if you "crush reweight to 0" before the maintenance, this would not
> be an issue.
> Is this the main reason?
>
> From what I can gather, even if you add new OSDs to the cluster and the
> rebalancing kicks in, it still maintains its two replicas.

No, the danger is that your only remaining replica dies during recovery.

I've seen this happen twice in two different clusters last week.

In one cluster a 3TB drive died, and while it was recovering a second 3TB
drive died, which caused some PGs to go 'undersized'. min_size was set to 2.

In another cluster a 1TB SSD died, and while we were recovering from that
failure another SSD failed, causing the same situation as described above.

IIRC the guys at Cern even run with 4 replicas, since they don't even think
3 is safe.

2 replicas isn't safe, no matter how big or small the cluster is. With disks
becoming larger, recovery times will grow. In that window you don't want to
run on a single replica.

> thanks.

--
Wido den Hollander
42on B.V.

Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
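
For reference, a minimal sketch of the commands touched on above, assuming a
pool named 'rbd' and an OSD with id osd.12 (both are just placeholders;
substitute your own pool names and OSD ids):

    # Check and raise the replica count on a pool; min_size is the minimum
    # number of copies a PG needs to keep serving I/O.
    ceph osd pool get rbd size
    ceph osd pool set rbd size 3
    ceph osd pool set rbd min_size 2

    # Before planned maintenance, drain the OSD instead of relying only on
    # "noout", so no PG has to run on a single remaining replica:
    ceph osd crush reweight osd.12 0
    # Wait for backfill to finish (watch 'ceph -s'), then stop the OSD.

This only illustrates the mechanics; it doesn't change the point above that
with size 2 any recovery window leaves you one more failure away from losing
the last copy.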