How safe is ceph pg repair these days?

Well, that's the question...is that safe? Because the link to the
mailing list post (possibly outdated) says that what you just suggested
is definitely NOT safe. Is the mailing list post wrong? Has the
situation changed? Exactly what does ceph repair do now? I suppose I
could go dig into the code but I'm not an expert and would hate to get
it wrong and post possibly bogus info to the list for other newbies to
find and worry about and possibly lose their data.

On Fri, Feb 17, 2017 at 06:08:39PM PST, Shinobu Kinjo spake thusly:
> if ``ceph pg deep-scrub <pg id>`` does not work
> then
>   do
>     ``ceph pg repair <pg id>``
> 
> 
> On Sat, Feb 18, 2017 at 10:02 AM, Tracy Reed <treed at ultraviolet.org> wrote:
> > I have a 3 replica cluster. A couple times I have run into inconsistent
> > PGs. I googled it and ceph docs and various blogs say run a repair
> > first. But a couple people on IRC and a mailing list thread from 2015
> > say that ceph blindly copies the primary over the secondaries and calls
> > it good.
> >
> > http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-May/001370.html
> >
> > I sure hope that isn't the case. If so it would seem highly
> > irresponsible to implement such a naive command called "repair". I have
> > recently learned how to properly analyze the OSD logs and manually fix
> > these things but not before having run repair on a dozen inconsistent
> > PGs. Now I'm worried about what sort of corruption I may have
> > introduced. Repairing things by hand is a simple heuristic based on
> > comparing the size or checksum (as indicated by the logs) for each of
> > the 3 copies and figuring out which is correct. Presumably matching two
> > out of three should win and the odd object out should be deleted since
> > having the exact same kind of error on two different OSDs is highly
> > improbable. I don't understand why ceph repair wouldn't have done this
> > all along.
> >
> > What is the current best practice in the use of ceph repair?
> >
> > Thanks!
> >
> > --
> > Tracy Reed
> >
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users at lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
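
The two-out-of-three heuristic described in the quoted message can be sketched as below. This only illustrates the voting logic; it is not a description of what ceph pg repair actually does. The function name pick_authoritative, the OSD ids, and the digest strings are all made up for the example, and the digests are assumed to have been read by hand from the OSD logs.

```python
from collections import Counter


def pick_authoritative(digests):
    """Majority vote over replica digests.

    digests: dict mapping a (hypothetical) OSD id to the checksum or size
    reported for that OSD's copy of the object.
    Returns (winning_digest, [OSD ids that disagree with it]).
    If no digest has a strict majority, returns (None, [all OSD ids]) so
    that a human decides instead of the code guessing.
    """
    if not digests:
        return None, []
    counts = Counter(digests.values())
    winner, votes = counts.most_common(1)[0]
    if votes * 2 <= len(digests):
        # No strict majority (e.g. all three copies differ).
        return None, sorted(digests)
    outliers = sorted(osd for osd, d in digests.items() if d != winner)
    return winner, outliers


# Made-up example: OSDs 3 and 7 agree, OSD 11 is the odd one out.
replicas = {3: "0xaabbccdd", 7: "0xaabbccdd", 11: "0x11223344"}
good, outliers = pick_authoritative(replicas)
print(good, outliers)
```

With three copies, the vote only yields an answer when at least two agree; when all three differ the sketch deliberately refuses to pick a winner.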

-- 
Tracy Reed

