What version of Ceph are you running? Is this a replicated or
erasure-coded pool?

On Fri, Dec 12, 2014 at 1:11 AM, Luis Periquito <periquito@xxxxxxxxx> wrote:
> Hi Greg,
>
> thanks for your help. It's always highly appreciated. :)
>
> On Thu, Dec 11, 2014 at 6:41 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>
>> On Thu, Dec 11, 2014 at 2:57 AM, Luis Periquito <periquito@xxxxxxxxx>
>> wrote:
>> > Hi,
>> >
>> > I've stopped OSD.16, removed the PG from the local filesystem and
>> > started the OSD again. After Ceph rebuilt the PG on that OSD I ran a
>> > deep-scrub, and the PG is still inconsistent.
>>
>> What led you to remove it from osd 16? Is that the one hosting the log
>> you snipped from? Is osd 16 the one hosting shard 6 of that PG, or was
>> it the primary?
>
> OSD.16 is both the primary for this PG and the one with the snipped log.
> The other 3 OSDs don't have any mention of this PG in their logs, just
> some messages about slow requests and the backfill when I removed the
> object. The inconsistent shard actually came from OSD.6 (currently we
> don't have an OSD.3).
>
> This is the output of pg dump for this PG:
>
> 9.180  25614  0  0  0  23306482348  3001  3001  active+clean+inconsistent
> 2014-12-10 17:29:01.937929  40242'1108124  40242:23305321  [16,10,27,6]  16
> [16,10,27,6]  16  40242'1071363  2014-12-10 17:29:01.937881
> 40242'1071363  2014-12-10 17:29:01.937881
>
>>
>> Anyway, the message means that shard 6 (which I think is the seventh
>> OSD in the list) of PG 9.180 is missing a bunch of xattrs on object
>> 370cbf80/29145.4_xxx/head//9. I'm actually a little surprised it
>> didn't crash if it's missing the "_" attr....
>> -Greg
>
> Any idea on how to fix it?
>
>>
>> > I'm running out of ideas on trying to solve this. Does this mean that
>> > all copies of the object should also be inconsistent? Should I just
>> > try to figure out which object/bucket this belongs to and delete it,
>> > then copy it to the ceph cluster again?
>> >
>> > Also, do you know what the error message means? Is it just some sort
>> > of metadata for this object that isn't correct, not the object itself?
>> >
>> > On Wed, Dec 10, 2014 at 11:11 AM, Luis Periquito <periquito@xxxxxxxxx>
>> > wrote:
>> >>
>> >> Hi,
>> >>
>> >> In the last few days this PG (pool is .rgw.buckets) has been in error
>> >> after running the scrub process.
>> >>
>> >> After getting the error, and trying to see what the issue might be
>> >> (and finding none), I just issued a ceph repair followed by a ceph
>> >> deep-scrub. However, it doesn't seem to have fixed the issue, which
>> >> still remains.
>> >>
>> >> The relevant log from the OSD is as follows:
>> >>
>> >> 2014-12-10 09:38:09.348110 7f8f618be700 0 log [ERR] : 9.180 deep-scrub 0 missing, 1 inconsistent objects
>> >> 2014-12-10 09:38:09.348116 7f8f618be700 0 log [ERR] : 9.180 deep-scrub 1 errors
>> >> 2014-12-10 10:13:15.922065 7f8f618be700 0 log [INF] : 9.180 repair ok, 0 fixed
>> >> 2014-12-10 10:55:27.556358 7f8f618be700 0 log [ERR] : 9.180 shard 6: soid 370cbf80/29145.4_xxx/head//9 missing attr _, missing attr _user.rgw.acl, missing attr _user.rgw.content_type, missing attr _user.rgw.etag, missing attr _user.rgw.idtag, missing attr _user.rgw.manifest, missing attr _user.rgw.x-amz-meta-md5sum, missing attr _user.rgw.x-amz-meta-stat, missing attr snapset
>> >> 2014-12-10 10:56:50.597952 7f8f618be700 0 log [ERR] : 9.180 deep-scrub 0 missing, 1 inconsistent objects
>> >> 2014-12-10 10:56:50.597957 7f8f618be700 0 log [ERR] : 9.180 deep-scrub 1 errors
>> >>
>> >> I'm running firefly, version 0.80.7.
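For reference, the "shard 6: ... missing attr" lines above amount to a set
difference between the xattrs the primary holds for the object and the xattrs
found on shard 6's copy. This is a hypothetical sketch (not Ceph code) of that
comparison, using the attr names taken from the scrub log; "_" is the object
info attr and "snapset" the snapshot metadata attr:

```python
# Attrs the primary (osd.16) holds for 370cbf80/29145.4_xxx/head//9,
# per the scrub log above.
primary_attrs = {
    "_", "snapset",
    "_user.rgw.acl", "_user.rgw.content_type", "_user.rgw.etag",
    "_user.rgw.idtag", "_user.rgw.manifest",
    "_user.rgw.x-amz-meta-md5sum", "_user.rgw.x-amz-meta-stat",
}

# Shard 6's copy, which the log reports as having none of them.
shard6_attrs = set()

# Deep-scrub flags every attr present on the primary but absent on the shard.
missing = sorted(primary_attrs - shard6_attrs)
for attr in missing:
    print("missing attr", attr)
print(len(missing), "attrs missing on shard 6")
```

Since every attr is missing, including "_", shard 6's copy has effectively no
object metadata at all, which matches Greg's surprise that the OSD didn't crash.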
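A side note on reading the soid: the hex prefix 370cbf80 in
370cbf80/29145.4_xxx/head//9 is the object-name hash, and the PG id comes from
folding that hash onto the pool's pg_num. As a sketch, assuming .rgw.buckets
here has pg_num = 512 (an assumption, not stated in the thread; for a
power-of-two pg_num the fold reduces to masking the low bits):

```python
# Sketch: map the soid hash prefix to a placement-group id,
# assuming pg_num = 512 for pool 9 (.rgw.buckets).
obj_hash = 0x370cbf80   # hash prefix from soid 370cbf80/29145.4_xxx/head//9
pg_num = 512            # assumed pool pg_num (power of two)

ps = obj_hash & (pg_num - 1)    # placement seed within the pool
print("PG id: 9.%x" % ps)       # matches the 9.180 seen in the logs
```

This is only meant to show why the 9.180 in the scrub log and the 370cbf80 in
the soid are consistent with each other.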
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com