> > yep, my fault I meant replication = 3 ....
> >
> > but aren't PGs checksummed so from the remaining PG (given the
> > checksum would be right) two new copies could be created?
>
> Assuming again 3R on 5 nodes, failure domain of host, if 2 nodes go down, some PGs will have only 1/3 copies available. Normally a 3R pool has min_size set to 2.
>
> You can set min_size to 1 temporarily, then those PGs will become active and copies will be created to restore redundancy, but if that remaining OSD is damaged, if there’s a DIMM flake, a cosmic ray, if the wrong OSD crashes or restarts at the wrong time, you can find yourself without the most recent copy of data and be unable to recover. It’s Russian Roulette.

I see, but wouldn't Ceph try to restore redundancy on its own (unless I explicitly tell it not to)? And if the I/O load on the cluster isn't too high, and disk speed and network connectivity are good, wouldn't it recover fairly quickly to a healthy, redundant state?

Anyhow, I'm not planning on crashing two nodes ;-) I just wanted to get a feeling for how much more secure/robust a setup with five nodes is compared to one with four nodes.

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
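To put a rough number on the four-node versus five-node question, here is a minimal sketch that counts how many copies of a PG survive when two hosts fail. It is only a back-of-the-envelope model, not anything Ceph itself computes: it assumes 3R with a failure domain of host and that PGs are spread evenly over every possible set of three hosts (real CRUSH placement is pseudo-random and weight-dependent). The function names are made up for this illustration.

```python
from fractions import Fraction
from itertools import combinations


def surviving_copy_distribution(num_hosts, failed_hosts, replicas=3):
    """Map 'copies still available' -> fraction of PGs in that state.

    Assumption (not from Ceph): every set of `replicas` distinct hosts
    holds the same share of PGs, i.e. an idealised uniform placement.
    """
    # By symmetry it does not matter which hosts fail, so fail the first ones.
    failed = set(range(failed_hosts))
    placements = list(combinations(range(num_hosts), replicas))
    counts = {}
    for placement in placements:
        alive = sum(1 for host in placement if host not in failed)
        counts[alive] = counts.get(alive, 0) + 1
    total = len(placements)
    return {alive: Fraction(n, total) for alive, n in sorted(counts.items())}


def inactive_fraction(num_hosts, failed_hosts, replicas=3, min_size=2):
    """Fraction of PGs with fewer than min_size copies left (they stop serving I/O)."""
    dist = surviving_copy_distribution(num_hosts, failed_hosts, replicas)
    return sum(
        (share for alive, share in dist.items() if alive < min_size),
        Fraction(0),
    )


if __name__ == "__main__":
    for hosts in (4, 5):
        dist = surviving_copy_distribution(hosts, failed_hosts=2)
        blocked = inactive_fraction(hosts, failed_hosts=2)
        print(f"{hosts} hosts, 2 down: copies left -> share of PGs: {dict(dist)}")
        print(f"  PGs below min_size=2: {blocked}")
```

Under these assumptions, losing two hosts leaves 3/10 of the PGs with a single copy on five hosts, against 1/2 of the PGs on four hosts. No PG loses all three copies in either case, but every PG that is down to one copy stays inactive at min_size 2 until a second copy is back.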