scrub error on firefly

sam.just@xxxxxxxxxxx (Samuel Just) · Fri, 11 Jul 2014 10:32:29 -0700



When you get the next inconsistency, can you copy the actual objects
from the osd store trees and get them to us?  That might provide a
clue.
-Sam
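
Working out which object to copy means reading the PG id and object name out of the scrub error. Here is a minimal sketch of that step, assuming the [ERR] digest-mismatch layout quoted further down in this thread. The function name `parse_scrub_error` and the field names are hypothetical, other Ceph releases may format these lines differently, and any line that a mail client has wrapped has to be rejoined first.

```python
import re

# Pattern inferred only from the [ERR] lines quoted in this thread.
SCRUB_ERR = re.compile(
    r"\[ERR\]\s+(?P<pg>[0-9a-f]+\.[0-9a-f]+)\s+"
    r"shard\s+(?P<shard>\d+):\s+"
    r"soid\s+(?P<hash>[0-9a-f]+)/(?P<name>[^/]+)/(?P<snap>[^/]+)//(?P<pool>\d+)\s+"
    r"digest\s+(?P<digest>\d+)\s+!=\s+known digest\s+(?P<known>\d+)"
)


def parse_scrub_error(line):
    """Return the fields of a digest-mismatch line, or None if it does not match."""
    match = SCRUB_ERR.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Numeric fields are converted so they can be compared directly.
    for key in ("shard", "pool", "digest", "known"):
        fields[key] = int(fields[key])
    return fields


if __name__ == "__main__":
    sample = (
        "2014-07-11 05:48:12.857657 osd.1 192.168.253.77:6801/12608 904 : "
        "[ERR] 3.76 shard 1: soid "
        "1280076/rb.0.b0ce3.238e1f29.00000000025c/head//3 digest "
        "2198242284 != known digest 3879754377"
    )
    info = parse_scrub_error(sample)
    print(info["pg"], info["shard"], info["name"])
```

The `pg` and `name` fields are the two pieces needed to find the object's file on each OSD in the acting set.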

On Fri, Jul 11, 2014 at 6:52 AM, Randy Smith <rbsmith at adams.edu> wrote:
>
>
>
> On Thu, Jul 10, 2014 at 4:40 PM, Samuel Just <sam.just at inktank.com> wrote:
>>
>> It could be an indication of a problem on osd 5, but the timing is
>> worrying.  Can you attach your ceph.conf?
>
>
> Attached.
>
>>
>> Have there been any osds
>> going down, new osds added, anything to cause recovery?
>
>
> I upgraded to firefly last week. As part of the upgrade I, obviously, had to
> restart every osd. Also, I attempted to switch to the optimal tunables but
> doing so degraded 27% of my cluster and made most of my VMs unresponsive. I
> switched back to the legacy tunables and everything was happy again. Both of
> those operations, of course, caused recoveries. I have made no changes since
> then.
>
>>
>>  Anything in
>> dmesg to indicate an fs problem?
>
>
> Nothing. The system went inconsistent again this morning, again on the same
> rbd but different osds this time.
>
> 2014-07-11 05:48:12.857657 osd.1 192.168.253.77:6801/12608 904 : [ERR] 3.76 shard 1: soid 1280076/rb.0.b0ce3.238e1f29.00000000025c/head//3 digest 2198242284 != known digest 3879754377
> 2014-07-11 05:49:29.020024 osd.1 192.168.253.77:6801/12608 905 : [ERR] 3.76 deep-scrub 0 missing, 1 inconsistent objects
> 2014-07-11 05:49:29.020029 osd.1 192.168.253.77:6801/12608 906 : [ERR] 3.76 deep-scrub 1 errors
>
> $ ceph health detail
> HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
> pg 3.76 is active+clean+inconsistent, acting [1,2]
> 1 scrub errors
>
>
>>
>>  Have you recently changed any
>> settings?
>
>
> I upgraded from bobtail to dumpling to firefly.
>
>>
>> -Sam
>>
>> On Thu, Jul 10, 2014 at 2:58 PM, Randy Smith <rbsmith at adams.edu> wrote:
>> > Greetings,
>> >
>> > Just a follow-up on my original issue. =ceph pg repair ...= fixed the
>> > problem. However, today I got another inconsistent pg. It's interesting
>> > to me that this second error is in the same rbd image and appears to be
>> > "close" to the previously inconsistent pg. (Even more fun, osd.5 was the
>> > secondary in the first error and is the primary here, though the other
>> > osd is different.)
>> >
>> > Is this indicative of a problem on osd.5 or perhaps a clue into what's
>> > causing firefly to be so inconsistent?
>> >
>> > The relevant log entries are below.
>> >
>> > 2014-07-07 18:50:48.646407 osd.2 192.168.253.70:6801/56987 163 : [ERR] 3.c6 shard 2: soid 34dc35c6/rb.0.b0ce3.238e1f29.00000000000b/head//3 digest 2256074002 != known digest 3998068918
>> > 2014-07-07 18:51:36.936076 osd.2 192.168.253.70:6801/56987 164 : [ERR] 3.c6 deep-scrub 0 missing, 1 inconsistent objects
>> > 2014-07-07 18:51:36.936082 osd.2 192.168.253.70:6801/56987 165 : [ERR] 3.c6 deep-scrub 1 errors
>> >
>> > 2014-07-10 15:38:53.990328 osd.5 192.168.253.81:6800/10013 257 : [ERR] 3.41 shard 1: soid e183cc41/rb.0.b0ce3.238e1f29.00000000024c/head//3 digest 3224286363 != known digest 3409342281
>> > 2014-07-10 15:39:11.701276 osd.5 192.168.253.81:6800/10013 258 : [ERR] 3.41 deep-scrub 0 missing, 1 inconsistent objects
>> > 2014-07-10 15:39:11.701281 osd.5 192.168.253.81:6800/10013 259 : [ERR] 3.41 deep-scrub 1 errors
>> >
>> >
>> >
>> > On Thu, Jul 10, 2014 at 12:05 PM, Chahal, Sudip <sudip.chahal at intel.com>
>> > wrote:
>> >>
>> >> Thanks - so it appears that the advantage of the 3rd replica (relative
>> >> to 2 replicas) has much more to do with recovering from two concurrent
>> >> OSD failures than with inconsistencies found during deep scrub - would
>> >> you agree?
>> >>
>> >> Re: repair - do you mean the "repair" process during deep scrub? If
>> >> yes, this is automatic - correct? Or are you referring to the explicit,
>> >> manually initiated repair commands?
>> >>
>> >> Thanks,
>> >>
>> >> -Sudip
>> >>
>> >> -----Original Message-----
>> >> From: Samuel Just [mailto:sam.just at inktank.com]
>> >> Sent: Thursday, July 10, 2014 10:50 AM
>> >> To: Chahal, Sudip
>> >> Cc: Christian Eichelmann; ceph-users at lists.ceph.com
>> >> Subject: Re: [ceph-users] scrub error on firefly
>> >>
>> >> Repair I think will tend to choose the copy with the lowest osd number
>> >> which is not obviously corrupted.  Even with three replicas, it does
>> >> not do any kind of voting at this time.
>> >> -Sam
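
The difference between the selection Sam describes and a majority vote can be illustrated with a small sketch. This is not Ceph's code: the function names, the `obviously_corrupt` set, and the digest strings are all hypothetical, and the first function only mirrors the description "lowest osd number which is not obviously corrupted".

```python
from collections import Counter


def pick_lowest_osd(copies, obviously_corrupt=frozenset()):
    """copies maps osd id -> digest. Return the lowest osd id not flagged as corrupt."""
    candidates = [osd for osd in sorted(copies) if osd not in obviously_corrupt]
    return candidates[0] if candidates else None


def pick_by_majority(copies):
    """Return the digest held by a strict majority of copies, or None if there is none."""
    if not copies:
        return None
    digest, count = Counter(copies.values()).most_common(1)[0]
    return digest if count * 2 > len(copies) else None


if __name__ == "__main__":
    # Three replicas; osd 1 holds a silently corrupted copy.
    three = {1: "bad", 2: "good", 5: "good"}
    print(pick_lowest_osd(three))   # osd 1 is chosen, although its copy is the odd one out
    print(pick_by_majority(three))  # a vote would prefer "good"

    # Two replicas that disagree: a vote cannot break the tie.
    print(pick_by_majority({1: "a", 2: "b"}))
```

Under these assumptions a third replica only helps with scrub inconsistencies if something actually compares the copies, whether that is a voting scheme or an administrator.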
>> >>
>> >> On Thu, Jul 10, 2014 at 10:39 AM, Chahal, Sudip
>> >> <sudip.chahal at intel.com>
>> >> wrote:
>> >> > I've a basic related question re: Firefly operation - would
>> >> > appreciate any insights:
>> >> >
>> >> > With three replicas, if checksum inconsistencies across replicas are
>> >> > found during deep-scrub then:
>> >> >   a. does the majority win, or is the primary always the winner and
>> >> >      used to overwrite the secondaries?
>> >> >   b. is this reconciliation done automatically during deep-scrub, or
>> >> >      does each reconciliation have to be executed manually by the
>> >> >      administrator?
>> >> >
>> >> > With 2 replicas - how are things different (if at all):
>> >> >   a. The primary is declared the winner - correct?
>> >> >   b. is this reconciliation done automatically during deep-scrub, or
>> >> >      does it have to be done "manually" because there is no majority?
>> >> >
>> >> > Thanks,
>> >> >
>> >> > -Sudip
>> >> >
>> >> >
>> >> > -----Original Message-----
>> >> > From: ceph-users [mailto:ceph-users-bounces at lists.ceph.com] On Behalf
>> >> > Of Samuel Just
>> >> > Sent: Thursday, July 10, 2014 10:16 AM
>> >> > To: Christian Eichelmann
>> >> > Cc: ceph-users at lists.ceph.com
>> >> > Subject: Re: [ceph-users] scrub error on firefly
>> >> >
>> >> > Can you attach your ceph.conf for your osds?
>> >> > -Sam
>> >> >
>> >> > On Thu, Jul 10, 2014 at 8:01 AM, Christian Eichelmann
>> >> > <christian.eichelmann at 1und1.de> wrote:
>> >> >> I can also confirm that after upgrading to firefly both of our
>> >> >> clusters (test and live) went from 0 scrub errors each for about
>> >> >> 6 months to about 9-12 per week...
>> >> >> This also makes me kind of nervous, since as far as I know all that
>> >> >> "ceph pg repair" does is copy the primary object to all replicas,
>> >> >> no matter which object is the correct one.
>> >> >> Of course the described method of manual checking works (for pools
>> >> >> with more than 2 replicas), but doing this in a large cluster nearly
>> >> >> every week is horribly time-consuming and error-prone.
>> >> >> It would be great to get an explanation for the increased number of
>> >> >> scrub errors since firefly. Were they just not detected correctly in
>> >> >> previous versions? Or is there maybe something wrong with the new
>> >> >> code?
>> >> >>
>> >> >> Actually, our company is currently preventing our projects from
>> >> >> moving to ceph because of this problem.
>> >> >>
>> >> >> Regards,
>> >> >> Christian
>> >> >> ________________________________
>> >> >> From: ceph-users [ceph-users-bounces at lists.ceph.com] on behalf of
>> >> >> Travis Rhoden [trhoden at gmail.com]
>> >> >> Sent: Thursday, 10 July 2014 16:24
>> >> >> To: Gregory Farnum
>> >> >> Cc: ceph-users at lists.ceph.com
>> >> >> Subject: Re: [ceph-users] scrub error on firefly
>> >> >>
>> >> >> And actually just to follow up, it does seem like there are some
>> >> >> additional smarts beyond just using the primary to overwrite the
>> >> >> secondaries...  Since I captured md5 sums before and after the
>> >> >> repair, I can say that in this particular instance, the secondary
>> >> >> copy was used to overwrite the primary.
>> >> >> So, I'm just trusting Ceph to do the right thing, and so far it seems
>> >> >> to, but the comments here about needing to determine the correct
>> >> >> object and place it on the primary PG make me wonder if I've been
>> >> >> missing something.
>> >> >>
>> >> >>  - Travis
>> >> >>
>> >> >>
>> >> >> On Thu, Jul 10, 2014 at 10:19 AM, Travis Rhoden <trhoden at gmail.com>
>> >> >> wrote:
>> >> >>>
>> >> >>> I can also say that after a recent upgrade to Firefly, I have
>> >> >>> experienced a massive uptick in scrub errors.  The cluster was on
>> >> >>> cuttlefish for about a year, and had maybe one or two scrub errors.
>> >> >>> After upgrading to Firefly, we've probably seen 3 to 4 dozen in the
>> >> >>> last month or so (was getting 2-3 a day for a few weeks until the
>> >> >>> whole cluster was rescrubbed, it seemed).
>> >> >>>
>> >> >>> What I cannot determine, however, is how to know which object is
>> >> >>> busted.
>> >> >>> For example, just today I ran into a scrub error.  The object has
>> >> >>> two copies and is an 8MB piece of an RBD, and has identical
>> >> >>> timestamps and identical xattr names and values.  But it definitely
>> >> >>> has a different MD5 sum. How do I know which one is correct?
>> >> >>>
>> >> >>> I've been just kicking off pg repair each time, which seems to just
>> >> >>> use the primary copy to overwrite the others.  Haven't run into any
>> >> >>> issues with that so far, but it does make me nervous.
>> >> >>>
>> >> >>>  - Travis
>> >> >>>
>> >> >>>
>> >> >>> On Tue, Jul 8, 2014 at 1:06 AM, Gregory Farnum <greg at inktank.com>
>> >> >>> wrote:
>> >> >>>>
>> >> >>>> It's not very intuitive or easy to look at right now (there are
>> >> >>>> plans from the recent developer summit to improve things), but the
>> >> >>>> central log should have output about exactly what objects are
>> >> >>>> busted. You'll then want to compare the copies manually to
>> >> >>>> determine which ones are good or bad, get the good copy on the
>> >> >>>> primary (make sure you preserve xattrs), and run repair.
>> >> >>>> -Greg
>> >> >>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>> >> >>>>
>> >> >>>>
>> >> >>>> On Mon, Jul 7, 2014 at 6:48 PM, Randy Smith <rbsmith at adams.edu>
>> >> >>>> wrote:
>> >> >>>> > Greetings,
>> >> >>>> >
>> >> >>>> > I upgraded to firefly last week and I suddenly received this
>> >> >>>> > error:
>> >> >>>> >
>> >> >>>> > health HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
>> >> >>>> >
>> >> >>>> > ceph health detail shows the following:
>> >> >>>> >
>> >> >>>> > HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
>> >> >>>> > pg 3.c6 is active+clean+inconsistent, acting [2,5]
>> >> >>>> > 1 scrub errors
>> >> >>>> >
>> >> >>>> > The docs say that I can run `ceph pg repair 3.c6` to fix this.
>> >> >>>> > What I want to know is what are the risks of data loss if I run
>> >> >>>> > that command in this state and how can I mitigate them?
>> >> >>>> >
>> >> >>>> > --
>> >> >>>> > Randall Smith
>> >> >>>> > Computing Services
>> >> >>>> > Adams State University
>> >> >>>> > http://www.adams.edu/
>> >> >>>> > 719-587-7741
>> >> >>>> >
>> >> >>>> > _______________________________________________
>> >> >>>> > ceph-users mailing list
>> >> >>>> > ceph-users at lists.ceph.com
>> >> >>>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >> >>>> >
>> >> >>>
>> >> >>>
>> >> >>
>> >> >>
>> >
>> >
>> >
>> >
>> >
>> >
>
>
>
>
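
Greg's advice to compare the copies manually, and Travis's practice of taking md5 sums before and after a repair, can be sketched as follows. This assumes the replica files have already been copied off the OSD hosts to local paths. The paths in the usage example and the function names `md5_of` and `group_by_digest` are hypothetical.

```python
import hashlib
from collections import defaultdict


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, read in chunks so large objects are not loaded at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_digest(paths):
    """Map each MD5 digest to the list of paths whose contents produce it."""
    groups = defaultdict(list)
    for path in paths:
        groups[md5_of(path)].append(path)
    return dict(groups)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file names): python compare_copies.py osd1.copy osd2.copy
    for digest, members in group_by_digest(sys.argv[1:]).items():
        print(digest, len(members), " ".join(members))
```

The grouping only shows which copies agree with each other, not which one is correct. With two replicas that differ it yields two groups of one, which is the situation Travis describes, so the decision still has to come from inspecting the data itself.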