Here it is: http://pastebin.com/HfUPDTK4

Someone asked:
I am still a beginner with Ceph, but as far as I understand, Ceph is not designed to lose 33% of the cluster at once and recover rapidly. My understanding is that by losing 1 rack out of 3 you are losing 33% of the cluster, and it will take a very long time to recover before you reach HEALTH_OK status. Can you check with ceph -w how long it takes for Ceph to converge to a healthy cluster after you switch off the switch in Rack-A? If I have a replica of each object in the other remaining racks (thanks to the CRUSH map), why should this impact my platform?
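A minimal sketch of one way to time that convergence, assuming the ceph CLI is available on the admin node and polling ceph health rather than parsing the full ceph -w stream (the polling interval and the plain-text HEALTH_OK check are illustrative, not the thread's actual procedure):

    #!/usr/bin/env python3
    # Illustrative only: poll `ceph health` until the cluster reports
    # HEALTH_OK again and print how long the recovery took.
    import subprocess
    import time

    POLL_SECONDS = 10  # assumption: a 10 s polling interval is acceptable

    start = time.time()
    while True:
        out = subprocess.run(["ceph", "health"],
                             capture_output=True, text=True).stdout
        if "HEALTH_OK" in out:
            break
        time.sleep(POLL_SECONDS)

    print("cluster converged to HEALTH_OK after %.0f seconds"
          % (time.time() - start))

Started right after the Rack-A switch is powered off, this gives a single wall-clock number for "time to healthy" that can be compared across tests.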
From: Andrey Korolyov [mailto:andrey@xxxxxxx]
> The question is: is this behavior indeed expected?

The answer can be positive if you are using a large number of placement groups, and 16k is indeed a large one. The peering may take a long time, effectively blocking I/O requests during this period. Do you have a ceph -w log of this transition to share?

Kind regards,
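For context on whether 16k placement groups is large for a given cluster, a rough sketch of the commonly cited sizing guideline (about 100 PGs per OSD, divided by the replica count, rounded up to a power of two); the OSD count and replica count below are made-up example values, not numbers from this thread:

    # Rough PG-count guideline sketch (not the authoritative Ceph pg calculator):
    # target_pgs ~= (num_osds * pgs_per_osd) / replica_count, rounded up to a power of 2.
    def suggested_pg_count(num_osds: int, replica_count: int,
                           pgs_per_osd: int = 100) -> int:
        raw = num_osds * pgs_per_osd / replica_count
        power = 1
        while power < raw:
            power *= 2
        return power

    # Example values (assumptions for illustration): 30 OSDs, 3 replicas.
    print(suggested_pg_count(30, 3))  # -> 1024

The more PGs each OSD carries, the more peering work every surviving OSD has to do at once when a whole rack of OSDs goes down or comes back, which is why a very high PG count can stretch out the window during which requests are blocked.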
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com