Is redundancy across failure domains guaranteed or best effort?

Note: the best answer to the questions below is obviously to avoid the situation in the first place by weighting drives properly and not approaching the full ratio. I'm just curious how Ceph works.

Hypothetical situation: say you have one pool with size=3 and three servers, each with 2 OSDs. Say you weighted the OSDs poorly, such that the OSDs on one server filled up while the OSDs on the other servers still had space. Ceph could still store 3 replicas of your data, but two of them would have to be on the same server.

What happens? (select all that apply)

a. [ ] Clients can still read data
b. [ ] Clients can still write data
c. [ ] health = HEALTH_WARN
d. [ ] health = HEALTH_OK
e. [ ] PGs are degraded
f. [ ] Ceph stores only two copies of the data
g. [ ] Ceph stores 3 copies of the data, two of which are on the same server
h. [ ] Something else?

If the answer is "best effort" (a+b+d+g), how would you detect that this scenario is occurring?

If the answer is "guaranteed" (f+e+c+...) and you lose a drive while in that scenario, is there any way to tell Ceph to temporarily store 2 copies on a single server, just in case? I suspect the answer is to remove the host bucket from the CRUSH map, but that's a really bad idea because it would trigger a rebuild, and the extra disk activity increases the likelihood of additional drive failures. Correct?
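For concreteness, the host separation I'm asking about comes from the pool's CRUSH rule. A default replicated rule, decompiled with crushtool, looks roughly like this (quoting from memory, so the exact fields may differ between releases):

    rule replicated_rule {
        id 0
        type replicated
        step take default
        # place each replica in a different host bucket
        step chooseleaf firstn 0 type host
        step emit
    }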
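On the detection question: if it's best effort, I'd expect to be able to spot it by mapping each PG's acting set back to hosts. A minimal sketch of what I have in mind, assuming the JSON output shapes of `ceph pg dump` and `ceph osd tree` on my release (field layouts vary between Ceph versions, so adjust as needed):

    #!/usr/bin/env python3
    # Sketch: flag PGs whose acting set puts two or more replicas on one host.
    # Assumes the JSON output of `ceph pg dump` and `ceph osd tree`; the exact
    # field layout differs between Ceph releases.
    import json
    import subprocess
    from collections import defaultdict

    def ceph_json(*args):
        """Run a ceph CLI subcommand and parse its JSON output."""
        out = subprocess.check_output(["ceph", *args, "--format=json"])
        return json.loads(out)

    # Map each OSD id to the host bucket that contains it.
    osd_to_host = {}
    for node in ceph_json("osd", "tree")["nodes"]:
        if node.get("type") == "host":
            for child in node.get("children", []):
                osd_to_host[child] = node["name"]

    # Walk every PG and group the OSDs in its acting set by host.
    dump = ceph_json("pg", "dump")
    pg_stats = dump.get("pg_map", dump).get("pg_stats", [])
    for pg in pg_stats:
        by_host = defaultdict(list)
        for osd in pg.get("acting", []):
            by_host[osd_to_host.get(osd, "unknown")].append(osd)
        colocated = {h: osds for h, osds in by_host.items() if len(osds) > 1}
        if colocated:
            print(pg["pgid"], "has co-located replicas:", colocated)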
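And on the "guaranteed" branch: rather than removing the host buckets outright, I gather the gentler variant is to point the pool at a rule whose failure domain is osd instead of host, something like this (again a sketch, same caveats as above):

    # hypothetical "relaxed" rule: allow replicas to share a host
    rule relaxed_rule {
        id 1
        type replicated
        step take default
        step chooseleaf firstn 0 type osd
        step emit
    }

On Luminous and later I believe `ceph osd crush rule create-replicated relaxed default osd` followed by `ceph osd pool set <pool> crush_rule relaxed` achieves the same thing without hand-editing the map. Either way it would still trigger the data movement I'm worried about, which is why I'm asking whether it's ever advisable.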