Thank you everyone for your replies. However, I feel that at least part of the discussion has deviated from the topic of my original post. As I wrote before, I am dealing with a toy cluster, whose purpose is not to provide resilient storage, but to evaluate Ceph and its behavior in the event of a failure, with particular attention paid to worst-case scenarios. This cluster is purposely minimal, built on VMs running on my workstation, with all OSDs storing data on a single SSD. That is definitely not a production system. I am not asking for advice on how to build resilient clusters, not at this point. I asked some questions about specific things that I noticed during my tests and that I was not able to find explained in the Ceph documentation.

Dan van der Ster wrote:
> See https://github.com/ceph/ceph/pull/8008 for the reason why min_size
> defaults to k+1 on ec pools.

That's a good point, but I am wondering why reads are also blocked when the number of OSDs falls to k. And what if the total number of OSDs in a pool (n) is larger than k+m: should min_size then be k(+1) or n-m(+1)? In any case, since min_size can easily be changed, I guess this is not an implementation issue but rather a documentation issue.

This leaves my questions still unanswered:

After killing m OSDs and setting min_size=k, most PGs were now active+undersized, often with ...+degraded and/or remapped, but a few were active+clean or active+clean+remapped. Why? I would expect all PGs to be in the same state (perhaps active+undersized+degraded?). Is this mishmash of PG states normal? If not, would I have avoided it had I created the pool with min_size=k=3 from the start? In other words, does min_size influence the assignment of PGs to OSDs, or is it only used to force I/O shutdown in the event of OSD failures?

Thank you very much
Maciej Puzio

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
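[Editor's sketch] The min_size arithmetic discussed in this thread can be summarized in a few lines. This is a simplified model, not Ceph code: it assumes only that an EC pool has k data chunks and m coding chunks, that data is reconstructible while at least k shards survive, and that a PG serves I/O only while at least min_size shards are up.

```python
# Simplified model of the min_size trade-off for an erasure-coded pool
# with k data chunks and m coding chunks (not actual Ceph code).

def pg_serves_io(surviving_shards: int, min_size: int) -> bool:
    """A PG serves I/O only while at least min_size shards are up."""
    return surviving_shards >= min_size

k, m = 3, 2                  # the k=3, m=2 profile from the original test
default_min_size = k + 1     # Ceph's default for EC pools (see PR 8008)

# With m OSDs down, exactly k shards survive. At the default min_size,
# I/O blocks even though the data is still reconstructible from k shards.
assert not pg_serves_io(k, default_min_size)

# Lowering min_size to k (as in the experiment above) unblocks I/O, at
# the cost of accepting new writes with no surviving redundancy at all.
assert pg_serves_io(k, min_size=k)
```

The rationale in PR 8008 is about that last line: with min_size=k, any further failure during a write makes the newly written data unrecoverable, which is why the default stays at k+1.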