Hi,

Is anybody using 4x replication (size=4, min_size=2) with Ceph?

The reason I'm asking is that a customer of mine asked me for a solution to prevent a situation which occurred: a cluster running with size=3 and replication over different racks was being upgraded from 13.2.5 to 13.2.6. Since the upgrade involved patching the OS as well, they rebooted one of the nodes. While that node was rebooting, a node in a different rack suddenly rebooted as well. It was unclear why this happened, but the node was gone.

With the upgraded node still rebooting and the other node crashed, about 120 PGs were inactive due to min_size=2. Between waiting for the nodes to come back and for recovery to finish, it took about 15 minutes before all VMs running inside OpenStack were back again.

So while you are upgrading or performing any other maintenance with size=3, you can't tolerate the failure of a node, as that will cause PGs to go inactive.

This made me think about using size=4 and min_size=2 to prevent this situation. That obviously has implications for write latency and cost, but it would prevent such a situation.

Is anybody here running a Ceph cluster with size=4 and min_size=2 for this reason?

Thank you,

Wido
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
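For anyone wanting to try this, the settings discussed above can be changed at runtime per pool with the standard Ceph CLI. This is just a sketch; the pool name `volumes` is an example, substitute your own pools:

```shell
# Raise replication to 4 copies while still serving I/O with only 2 up,
# so one planned reboot plus one unexpected failure keeps PGs active.
# ("volumes" is an example pool name -- adjust for your cluster.)
ceph osd pool set volumes size 4
ceph osd pool set volumes min_size 2

# Verify the new settings took effect
ceph osd pool get volumes size
ceph osd pool get volumes min_size
```

Note that raising size triggers backfill to create the fourth copy, so capacity and recovery traffic need to be accounted for before making the change.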