From what I understand, in Jewel and later Ceph has the concept of an
authoritative shard, so in the case of a 3x replica pool it will notice
that two replicas match and one doesn't, and use one of the good replicas.
In a 2x pool, however, you're out of luck.

If someone could confirm my suspicions, that would be good as well.

> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces at lists.ceph.com] On Behalf Of Tracy Reed
> Sent: 18 February 2017 03:06
> To: Shinobu Kinjo <skinjo at redhat.com>
> Cc: ceph-users <ceph-users at ceph.com>
> Subject: Re: [ceph-users] How safe is ceph pg repair these days?
>
> Well, that's the question...is that safe? Because the link to the mailing list
> post (possibly outdated) says that what you just suggested is definitely NOT
> safe. Is the mailing list post wrong? Has the situation changed? Exactly what
> does ceph repair do now? I suppose I could go dig into the code but I'm not
> an expert and would hate to get it wrong and post possibly bogus info to
> the list for other newbies to find and worry about and possibly lose their
> data.
>
> On Fri, Feb 17, 2017 at 06:08:39PM PST, Shinobu Kinjo spake thusly:
> > if ``ceph pg deep-scrub <pg id>`` does not work then
> >   do
> >     ``ceph pg repair <pg id>``
> >
> >
> > On Sat, Feb 18, 2017 at 10:02 AM, Tracy Reed <treed at ultraviolet.org> wrote:
> > > I have a 3 replica cluster. A couple times I have run into
> > > inconsistent PGs. I googled it and ceph docs and various blogs say
> > > run a repair first. But a couple people on IRC and a mailing list
> > > thread from 2015 say that ceph blindly copies the primary over the
> > > secondaries and calls it good.
> > >
> > > http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-May/001370.html
> > >
> > > I sure hope that isn't the case. If so it would seem highly
> > > irresponsible to implement such a naive command called "repair". I
> > > have recently learned how to properly analyze the OSD logs and
> > > manually fix these things but not before having run repair on a
> > > dozen inconsistent PGs. Now I'm worried about what sort of
> > > corruption I may have introduced. Repairing things by hand is a
> > > simple heuristic based on comparing the size or checksum (as
> > > indicated by the logs) for each of the 3 copies and figuring out
> > > which is correct. Presumably matching two out of three should win
> > > and the odd object out should be deleted since having the exact same
> > > kind of error on two different OSDs is highly improbable. I don't
> > > understand why ceph repair wouldn't have done this all along.
> > >
> > > What is the current best practice in the use of ceph repair?
> > >
> > > Thanks!
> > >
> > > --
> > > Tracy Reed
> > >
> > > _______________________________________________
> > > ceph-users mailing list
> > > ceph-users at lists.ceph.com
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >
>
> --
> Tracy Reed
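
As an illustration only, the manual two-out-of-three check described in the quoted message can be sketched in Python. The function name, OSD names, sizes and checksums below are all made up for the example. This is not a description of what ``ceph pg repair`` does internally, just the heuristic of comparing size and checksum across the copies and treating the odd one out as bad.

```python
from collections import Counter


def find_odd_replica(replicas):
    """Pick the majority (size, checksum) pair and list the OSDs that disagree.

    `replicas` maps a (hypothetical) OSD name to the (size, checksum) pair
    read by hand from that OSD's log. Raises ValueError when no pair is
    held by a strict majority of the copies.
    """
    counts = Counter(replicas.values())
    fingerprint, votes = counts.most_common(1)[0]
    if votes * 2 <= len(replicas):
        raise ValueError("no strict majority; cannot tell which copy is good")
    bad = sorted(osd for osd, fp in replicas.items() if fp != fingerprint)
    return fingerprint, bad


# Made-up values for one object in an inconsistent PG on a 3x pool.
replicas = {
    "osd.3": (4194304, "0x2d4a11c2"),
    "osd.7": (4194304, "0x2d4a11c2"),
    "osd.12": (4194304, "0x9f1e03b7"),
}

good, bad = find_odd_replica(replicas)
print("majority copy:", good)
print("suspect OSDs:", bad)
```

With only two copies that disagree, the function raises instead of guessing, which is the 2x pool situation mentioned at the top of this message: there is no majority to vote with.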