Reading about min_size behavior, I understand the following: if I set min_size=1, it applies to both reading and writing, so it is dangerous. Can I (via the CRUSH map) or the developers (via code changes) separate the min_size behavior for reading and writing? I even tried to look into the code, but IMHO the developers can solve this much more easily.

Thinking about this, I would prefer to see the following behaviour (for example): on write, if the number of available replicas of a PG is < size but >= min_size, heal this PG to a consistent state before writing. It could consist of these steps (I don't understand the code structure yet, but I have some rough ideas about how to split the tasks):

1) On repair (healing), process write-pending PGs first (this is a good idea on its own, IMHO), to minimize client freezing;
2) Minimize the time these write-pending PGs wait before healing (same goal);
3) Make min_size [optionally] apply only to reads.

Or simply always clone ("heal") inconsistent PGs up to "size" as part of the write task (if the code allows it). Then write requests would always be protected from data loss (of course, in a large cluster the written and the offline OSDs could still swap roles in one pass, but this is the minimum needed for peace of mind about min_size).

-- 
WBR, Dzianis Kahanovich AKA Denis Kaganovich, http://mahatma.bspu.unibel.by/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
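
To make the proposal above concrete, here is a minimal sketch of the two decision rules side by side. It is plain Python, not Ceph code: the names `Action`, `current_policy`, `proposed_policy` and the parameter `active` (the number of replicas of a PG that are currently available) are all hypothetical and chosen only for illustration. `current_policy` models min_size as understood in this message (one threshold for reads and writes); `proposed_policy` models the requested split, where a write to a degraded PG triggers healing first.

```python
from enum import Enum


class Action(Enum):
    PROCEED = "proceed"                   # serve the request immediately
    HEAL_THEN_WRITE = "heal_then_write"   # restore replicas to `size`, then write
    BLOCK = "block"                       # too few replicas, request waits


def current_policy(op: str, active: int, size: int, min_size: int) -> Action:
    """min_size as described in the message: one threshold for reads and writes."""
    return Action.PROCEED if active >= min_size else Action.BLOCK


def proposed_policy(op: str, active: int, size: int, min_size: int) -> Action:
    """Proposed split: min_size gates reads; degraded writes heal the PG first."""
    if active < min_size:
        return Action.BLOCK
    if op == "write" and active < size:
        # Degraded but readable PG: bring it back to `size` before accepting data.
        return Action.HEAL_THEN_WRITE
    return Action.PROCEED


if __name__ == "__main__":
    # Example pool: size=3, min_size=1, only one replica currently available.
    for op in ("read", "write"):
        print(op,
              current_policy(op, active=1, size=3, min_size=1).value,
              proposed_policy(op, active=1, size=3, min_size=1).value)
```

With size=3, min_size=1 and a single available replica, the first rule accepts the write on that one copy, while the second rule keeps reads available and delays the write until the PG is healed. The sketch ignores how healing is scheduled (steps 1 and 2 above), which is where the real work would be.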