> However I'm interested in the following
>
> On 11/16/20 11:31 AM, Janne Johansson wrote:
> > So while one could always say "one more drive is better than your
> > amount", there are people losing data with repl=2 or K+1 because some
> > more normal operation was in flight and _then_ a single surprise
> > happens. So you can have a weird reboot, causing those PGs needing
> > backfill later, and if one of the uptodate hosts have any single
> > surprise during the recovery, the cluster will lack some of the current
> > data even if two disks were never down at the same time.
>
> I'm not sure I follow, from a logical perspective they *are* down at the
> same time right? In your scenario 1 up-to-date replica was left, but
> even that had a surprise. Okay well that's the risk you take with R=2,
> but it's not intrinsically different than R=3.

I was trying to describe something like this:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-September/013237.html

There are more posts from Ceph consultants who get called in after someone
running "only" R=2/EC K+1 sees data loss, but I didn't dig them all up.

I.e., a kind of split-brain scenario where a small fault/outage on one of
the drives and a later, bigger fault on another will hurt you in R=2 or
K+1 setups, even though you never have two full faults at once: one drive
is only temporarily out, and the second one has the single "real" fault
("disk died" or similar) that we usually imagine as the scenario RAID or
replication sizes are meant to handle.

Not trying to say you don't understand this, but rather that people who
run small Ceph clusters tend to start out with R=2 or K+1 EC because the
larger faults are easier to imagine.

When you have R=3 and you move one of the three PG copies for a disk
resize or something, you are temporarily reduced to two copies (or at
least two up-to-date copies, if writes are happening during the move), so
you can still absorb one surprise until the move completes without losing
data. With R=2/EC K+1, not so much.

Also, by calling them small and large faults I mean there is a huge
difference between "a few PGs with issues" and "disk completely broken";
but if the PGs belong to a pool holding disk images, then all the images
in that pool are prone to errors. It is not "we lost 1% of the files and
only need to restore those"; rather, the disk images all end up with
random holes in them.

-- 
May the most significant bit of your life be positive.
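
To make that margin explicit, here is a small sketch of the arithmetic (my
own illustration, nothing from Ceph itself; the function names are made up,
and it assumes one copy or shard is out for a planned move while exactly
one surprise failure happens before backfill finishes):

    # Illustration only: up-to-date data sources left when a planned
    # move/backfill overlaps one unplanned failure.

    def surviving_replicas(size: int, planned_out: int = 1, surprises: int = 1) -> int:
        """Current copies left in a replicated pool while a planned move is in flight."""
        return size - planned_out - surprises

    def ec_shards_missing(k: int, m: int, planned_out: int = 1, surprises: int = 1) -> int:
        """How many shards short of the K needed to rebuild an EC K+M object."""
        available = k + m - planned_out - surprises
        return max(0, k - available)

    print(surviving_replicas(2))    # 0 -> R=2: no current copy left
    print(surviving_replicas(3))    # 1 -> R=3: still one up-to-date copy
    print(ec_shards_missing(4, 1))  # 1 -> K+1: one shard short, object unreadable
    print(ec_shards_missing(4, 2))  # 0 -> K+2: still reconstructable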