Hello,

First of all, I would recommend that you use ceph pg repair wherever you can.
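For example, a typical sequence looks roughly like this (a sketch; the PG ID 2.1f is made up, and list-inconsistent-obj only has data after a deep scrub has run):

```shell
# Find PGs that scrubbing has flagged as inconsistent
ceph health detail | grep inconsistent

# Inspect which objects/shards are inconsistent in a given PG
rados list-inconsistent-obj 2.1f --format=json-pretty

# Ask the primary OSD to repair the PG
ceph pg repair 2.1f
```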
When you have size=3 the cluster can compare three instances, so it is easier for it to spot which two are good and which one is bad. When you use size=2 the case is harder in oh-so-many ways:

- According to the documentation it is harder to determine which object is the faulty one.
- If an OSD dies, the increased load (caused by the missing OSD) and the extra I/O from the recovery process hit the other OSDs much harder, increasing the chance that another OSD dies (a disk failure caused by the sudden spike of extra load), and then you lose your data.
- If there is bitrot in the one remaining replica, then you do not have any valid copy of your data.

So, to summarize: the experts say that it is MUCH safer to have size=3 min_size=2 (I'm far from an expert, I'm just quoting :))
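If you want to move to that configuration, the settings are per pool, something like this ("rbd" is a placeholder pool name; raising size triggers backfill, so check capacity with ceph df first):

```shell
# Verify you have the free space for a third copy
ceph df

# Raise the replica count and the minimum replicas needed for I/O
ceph osd pool set rbd size 3
ceph osd pool set rbd min_size 2
```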
So, back to the task at hand: if you have repaired all the PGs that you could with ceph pg repair, there is a manual recovery process (written for filestore, unfortunately): http://ceph.com/geen-categorie/ceph-manually-repair-object/

The good news is that there is a FUSE client for bluestore too, so you can mount the OSD by hand and repair it as per the linked document.

I think that you could run ceph osd pool set [pool] size 3 to increase the copy count, but before that you should be certain that you have enough free space and that you will not hit the OSD PG count limits.

[DISCLAIMER]: I have never done this, and I too have questions about this topic:

[Questions to the list]
How is it possible that the cluster cannot repair itself with ceph pg repair? Are no good copies remaining? Can it not decide which copy is valid or up to date? If so, why not, when there is a checksum and an mtime for everything? In this inconsistent state, which object does the cluster serve when it does not know which one is valid?
Isn't there a way to do a more "online" repair? A way to examine and remove objects while the OSD is running? Or better yet, to tell the cluster which copy should be used when repairing? There is a command, ceph pg force-recovery, but I cannot find documentation for it.
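From what I can tell it takes one or more PG IDs and bumps their recovery priority, rather than choosing a copy (this is an assumption on my part; 2.1f below is a made-up PG ID):

```shell
# Ask the cluster to prioritize recovery of these PGs
ceph pg force-recovery 2.1f

# There is a matching command for backfill as well
ceph pg force-backfill 2.1f
```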
Kind regards,
Denes Dolhay
On 10/28/2017 01:05 PM, Mario Giammarco wrote:
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com