Hi everyone,
Yesterday, scrubbing turned up an inconsistency in one of our placement
groups. We are running Ceph 10.2.3, using CephFS, plus RBD for some VM
images.
[root@hyperv017 ~]# ceph -s
    cluster d7b33135-0940-4e48-8aa6-1d2026597c2f
     health HEALTH_ERR
            1 pgs inconsistent
            1 scrub errors
            noout flag(s) set
     monmap e15: 3 mons at {hyperv029=10.4.36.179:6789/0,hyperv030=10.4.36.180:6789/0,hyperv031=10.4.36.181:6789/0}
            election epoch 27192, quorum 0,1,2 hyperv029,hyperv030,hyperv031
      fsmap e17181: 1/1/1 up {0=hyperv029=up:active}, 2 up:standby
     osdmap e342930: 385 osds: 385 up, 385 in
            flags noout
      pgmap v37580512: 34816 pgs, 5 pools, 673 TB data, 198 Mobjects
            1583 TB used, 840 TB / 2423 TB avail
               34809 active+clean
                   4 active+clean+scrubbing+deep
                   2 active+clean+scrubbing
                   1 active+clean+inconsistent
  client io 87543 kB/s rd, 671 MB/s wr, 23 op/s rd, 2846 op/s wr
# ceph pg dump | grep inconsistent
6.13f1  4692  0  0  0  0  16057314767  3087  3087  active+clean+inconsistent
        2016-12-14 16:49:48.391572  342929'41011  342929:43966
        [158,215,364]  158  [158,215,364]  158
        342928'40540  2016-12-14 16:49:48.391511
        342928'40540  2016-12-14 16:49:48.391511
I tried a couple of other deep scrubs on pg 6.13f1 but got repeated
errors. In the OSD logs:
2016-12-14 16:48:07.733291 7f3b56e3a700 -1 log_channel(cluster) log [ERR] : deep-scrub 6.13f1 6:8fc91b77:::1000187bb70.00000009:head on disk size (0) does not match object info size (1835008) adjusted for ondisk to (1835008)
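In case it helps with diagnosing, I can also pull a structured summary
of what the scrub found; I believe jewel's rados has this subcommand,
though I haven't run it yet, so please correct me if the invocation is
off:
# rados list-inconsistent-obj 6.13f1 --format=json-pretty
As far as I understand, that should list each replica of the object and
what the deep scrub flagged on it.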
I looked at the object copies on the 3 OSDs on their respective hosts,
and they are all the same: zero-length files:
# cd ~ceph/osd/ceph-158/current/6.13f1_head
# find . -name *1000187bb70* -ls
669738 0 -rw-r--r-- 1 ceph ceph 0 Dec 13 17:00
./DIR_1/DIR_F/DIR_3/DIR_9/DIR_8/1000187bb70.00000009__head_EED893F1__6
# cd ~ceph/osd/ceph-215/current/6.13f1_head
# find . -name *1000187bb70* -ls
539815647 0 -rw-r--r-- 1 ceph ceph 0 Dec 13 17:00
./DIR_1/DIR_F/DIR_3/DIR_9/DIR_8/1000187bb70.00000009__head_EED893F1__6
# cd ~ceph/osd/ceph-364/current/6.13f1_head
# find . -name *1000187bb70* -ls
1881432215 0 -rw-r--r-- 1 ceph ceph 0 Dec 13 17:00
./DIR_1/DIR_F/DIR_3/DIR_9/DIR_8/1000187bb70.00000009__head_EED893F1__6
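If it would help, I can also dump the xattrs on one of these files. As
far as I understand, filestore keeps the object info (including the
1835008 size the scrub is comparing against) in the user.ceph._ xattr,
so something along these lines should show it; the ceph-dencoder
invocation is my best guess at the right incantation, so please correct
me if it's wrong:
# cd ~ceph/osd/ceph-158/current/6.13f1_head/DIR_1/DIR_F/DIR_3/DIR_9/DIR_8
# attr -l 1000187bb70.00000009__head_EED893F1__6
# attr -q -g "ceph._" 1000187bb70.00000009__head_EED893F1__6 > /tmp/oi.bin
# ceph-dencoder type object_info_t import /tmp/oi.bin decode dump_json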
At the time of the write, there wasn't anything unusual going on as far
as I can tell (no hardware/network issues, all processes were up, etc.).
This pool is a CephFS data pool, and the corresponding file (inode
1000187bb70 in hex, 1099537300336 in decimal) looks like this:
# ls -li chr4.tags.tsv
1099537300336 -rw-r--r-- 1 xichen xichen 14469915 Dec 13 17:01 chr4.tags.tsv
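(The object name prefix is just this inode number rendered in hex; a
quick check:
# printf '%x\n' 1099537300336
1000187bb70
which matches the 1000187bb70 prefix in the scrub error above.)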
Reading the file also works fine (no errors, and the right number of bytes):
# cat chr4.tags.tsv > /dev/null
# wc chr4.tags.tsv
592251 2961255 14469915 chr4.tags.tsv
We are using the standard 4MB object size for CephFS, and if I interpret
the object name right, the .00000009 suffix is stripe index 9, i.e. the
10th 4MB object (bytes 36-40MB of the file). Since the file is only
14MB, that object shouldn't hold any data, or even exist at all. Should
I run pg repair on this? Any ideas on how this could have come about?
Any other recommendations?
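(For reference, the arithmetic behind the "shouldn't even exist" claim,
assuming the default 4 MiB object size and no custom file layout:
# echo $(( (14469915 + 4194304 - 1) / 4194304 ))
4
so the file should only span objects .00000000 through .00000003.)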
Thanks,
Andras
apataki@xxxxxxxxxxx