Hi all,
Since last week, some PGs have been going into the inconsistent state
after a scrub error. Last week we had 4 PGs in that state; they were on
different OSDs, but all in the metadata pool.
I did a pg repair on them, and all were healthy again. But now one PG
is inconsistent again.
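For reference, the repairs last week were just the standard commands,
with this week's PG id as the example:

# find the PGs flagged inconsistent
ceph health detail | grep inconsistent

# kick off a repair of a single PG
ceph pg repair 6.2f4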
With ceph health detail I see:
pg 6.2f4 is active+clean+inconsistent, acting [3,5,1]
1 scrub errors
And in the log of the primary:
2016-08-06 06:24:44.723224 7fc5493f3700 -1 log_channel(cluster) log [ERR] : 6.2f4 shard 5: soid 6:2f55791f:::606.00000000:head omap_digest 0x3a105358 != best guess omap_digest 0xc85c4361 from auth shard 1
2016-08-06 06:24:53.931029 7fc54bbf8700 -1 log_channel(cluster) log [ERR] : 6.2f4 deep-scrub 0 missing, 1 inconsistent objects
2016-08-06 06:24:53.931055 7fc54bbf8700 -1 log_channel(cluster) log [ERR] : 6.2f4 deep-scrub 1 errors
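If it happens again, I was planning to dump the scrub error details
before repairing, assuming the list-inconsistent commands are available
on our release (they appeared in Jewel):

# list the PGs with scrub inconsistencies in a pool (pool name is a
# placeholder)
rados list-inconsistent-pg <metadata-pool>

# dump the inconsistent object(s) for the PG, which should show the
# per-shard digests that disagree
rados list-inconsistent-obj 6.2f4 --format=json-pretty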
I looked in dmesg but couldn't see any I/O errors on any of the OSDs in
the acting set, and last week it was a different acting set. It is of
course possible that more than one OSD is failing, but how can we check
this, since there is nothing more in the logs?
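So far the only extra checks I could come up with are SMART on the
disks behind the acting OSDs and grepping the OSD logs, something like
the following (the device name and log path are just examples from our
setup, osd.5 being the shard the scrub complained about):

# on the host of each OSD in the acting set, check the backing disk
# for reallocated/pending/uncorrectable sectors
smartctl -a /dev/sdb | egrep -i 'reallocated|pending|uncorrect'

# and look for earlier read errors or scrub complaints in the OSD log
egrep -i 'error|scrub' /var/log/ceph/ceph-osd.5.log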
Thanks !!
K