On Sat, 8 Aug 2015, ?????????? ??????? wrote:
> Hi!
>
> I have a large number of inconsistent pgs, 229 of 656, and it's
> increasing every hour.
> I'm using ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3).
>
> For example, pg 3.d8:
> # ceph health detail | grep 3.d8
> pg 3.d8 is active+clean+scrubbing+deep+inconsistent, acting [1,7]
>
> # grep 3.d8 /var/log/ceph/ceph-osd.1.log | less -S
> 2015-08-07 13:10:48.311810 7f5903f7a700  0 log_channel(cluster) log [INF] : 3.d8 repair starts
> 2015-08-07 13:12:05.703084 7f5903f7a700 -1 log_channel(cluster) log [ERR] : repair 3.d8 cbd2d0d8/rbd_data.6a5cf474b0dc51.0000000000000b1f/head//3 on disk data digest 0x6e4d80bf != 0x6fb5b103
> 2015-08-07 13:13:26.837524 7f5903f7a700 -1 log_channel(cluster) log [ERR] : repair 3.d8 b5892d8/rbd_data.dbe674b0dc51.00000000000001b9/head//3 on disk data digest 0x79082779 != 0x9f102f3d
> 2015-08-07 13:13:44.874725 7f5903f7a700 -1 log_channel(cluster) log [ERR] : repair 3.d8 ee6dc2d8/rbd_data.e7592ae8944a.0000000000000833/head//3 on disk data digest 0x63ab49d0 != 0x68778496
> 2015-08-07 13:14:19.378582 7f5903f7a700 -1 log_channel(cluster) log [ERR] : repair 3.d8 d93e14d8/rbd_data.3ef8442ae8944a.0000000000000729/head//3 on disk data digest 0x3cdb1f5c != 0x4e0400c2
> 2015-08-07 13:23:38.668080 7f5903f7a700 -1 log_channel(cluster) log [ERR] : 3.d8 repair 4 errors, 0 fixed
> 2015-08-07 13:23:38.714668 7f5903f7a700  0 log_channel(cluster) log [INF] : 3.d8 deep-scrub starts
> 2015-08-07 13:25:00.656306 7f5903f7a700 -1 log_channel(cluster) log [ERR] : deep-scrub 3.d8 cbd2d0d8/rbd_data.6a5cf474b0dc51.0000000000000b1f/head//3 on disk data digest 0x6e4d80bf != 0x6fb5b103
> 2015-08-07 13:26:18.775362 7f5903f7a700 -1 log_channel(cluster) log [ERR] : deep-scrub 3.d8 b5892d8/rbd_data.dbe674b0dc51.00000000000001b9/head//3 on disk data digest 0x79082779 != 0x9f102f3d
> 2015-08-07 13:26:42.084218 7f5903f7a700 -1 log_channel(cluster) log [ERR] : deep-scrub 3.d8 ee6dc2d8/rbd_data.e7592ae8944a.0000000000000833/head//3 on disk data digest 0x59a6e7e0 != 0x68778496

This indicates the stored crc doesn't match the observed crc, and

> 2015-08-07 13:26:56.495207 7f5903f7a700 -1 log_channel(cluster) log [ERR] : be_compare_scrubmaps: 3.d8 shard 1: soid cc49f2d8/rbd_data.3ef8442ae8944a.0000000000000aff/head//3 data_digest 0x4e20a792 != known data_digest 0xc0e9b2d2 from auth shard 7

this indicates two replicas do not match.

> 2015-08-07 13:27:12.134765 7f5903f7a700 -1 log_channel(cluster) log [ERR] : deep-scrub 3.d8 d93e14d8/rbd_data.3ef8442ae8944a.0000000000000729/head//3 on disk data digest 0x3cdb1f5c != 0x4e0400c2
>
> osd.7.log is clean for that period of time.
> /var/log/dmesg is also clean.

This really shouldn't happen, but there has been one recently fixed bug
that could have corrupted a replica.

Can you locate two mismatched copies of some object on two different
OSDs, and use ceph-post-file so that we can take a look at the actual
corruption?  For this PG, for example, the mismatched copies are on
osd.1 and osd.7.  On those hosts, you can find the backing file with

 find /var/lib/ceph/osd/ceph-1/current/3.d8_head | grep rbd_data.3ef8442ae8944a.0000000000000aff

Alternatively, if the data is sensitive, can you diff the 'hexdump -C'
output of both files, see what the differing bytes look like, and
describe that to us?

Thanks!
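Roughly, the above would look something like this (only a sketch, untested
here; the /tmp copy names are just placeholders, and you may need to adjust
the OSD ids and paths for your hosts):

 # on the host with osd.1 (repeat on the osd.7 host with ceph-7 in the path)
 find /var/lib/ceph/osd/ceph-1/current/3.d8_head | grep rbd_data.3ef8442ae8944a.0000000000000aff
 cp "<file found above>" /tmp/osd1.obj

 # either upload the copy for us to inspect (and likewise for the osd.7 copy)
 ceph-post-file /tmp/osd1.obj

 # or, if the data is sensitive, compare the two copies locally and
 # describe the differing bytes
 hexdump -C /tmp/osd1.obj > /tmp/osd1.hex
 hexdump -C /tmp/osd7.obj > /tmp/osd7.hex
 diff -u /tmp/osd1.hex /tmp/osd7.hex | head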
sage