Re: Ceph pg active+clean+inconsistent

Yes, size = 3, and I have checked that all three replicas are the same zero-length object on disk.  I think some metadata is mismatched with what the OSD log refers to as the "object info size", but I'm not sure what to do about it; pg repair does not fix it.  In fact, the file this object corresponds to in CephFS is shorter, so this chunk shouldn't even exist, I think (details are in the original email).  Although I may be misunderstanding the situation ...
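
(For reference, one way to see both sides of the mismatch; the pool name below is a placeholder for our CephFS data pool, and the object name is the one from the repair log quoted further down:

# ceph osd map <cephfs-data-pool> 1000187bbb5.00000009
# rados -p <cephfs-data-pool> stat 1000187bbb5.00000009

rados stat returns the size recorded in the object info, 3014656 here, while the replicas on disk are all 0 bytes.)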

Andras


On 12/21/2016 07:17 AM, Mehmet wrote:
Hi Andras,

I'm not the most experienced user, but I guess you could have a look at this object on each OSD holding the PG, compare them, and delete the differing object.  I assume you have size = 3.

Then run pg repair again.

But be careful: IIRC the replica will be recovered from the primary PG.
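
Something like this on each of the three OSD hosts, then compare the checksums (OSD ids, PG and object name taken from your first mail, adjust as needed):

# cd ~ceph/osd/ceph-158/current/6.13f1_head
# find . -name '*1000187bb70*' -exec md5sum {} +

and the same under ceph-215 and ceph-364.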

Hth

On December 20, 2016 at 22:39:44 CET, Andras Pataki <apataki@xxxxxxxxxxxxxxxxxxxx> wrote:
Hi cephers,

Any ideas on how to proceed with the inconsistencies below?  At the moment our Ceph setup has 5 of these; in all cases it looks like zero-length objects that match across the three replicas but do not match the object info size.  I tried running pg repair on one of them, but it didn't repair the problem:

2016-12-20 16:24:40.870307 7f3e1a4b1700  0 log_channel(cluster) log [INF] : 6.92c repair starts
2016-12-20 16:27:06.183186 7f3e1a4b1700 -1 log_channel(cluster) log [ERR] : repair 6.92c 6:34932257:::1000187bbb5.00000009:head on disk size (0) does not match object info size (3014656) adjusted for ondisk to (3014656)
2016-12-20 16:27:35.885496 7f3e17cac700 -1 log_channel(cluster) log [ERR] : 6.92c repair 1 errors, 0 fixed
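
For what it's worth, on Jewel the scrub errors can also be dumped per object, which shows which shard(s) the size mismatch is attributed to (the pg needs a reasonably fresh scrub for this to return data):

# rados list-inconsistent-obj 6.92c --format=json-pretty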

Any help/hints would be appreciated.

Thanks,

Andras


On 12/15/2016 10:13 AM, Andras Pataki wrote:
Hi everyone,

Yesterday scrubbing turned up an inconsistency in one of our placement groups.  We are running ceph 10.2.3, using CephFS and RBD for some VM images.

[root@hyperv017 ~]# ceph -s
    cluster d7b33135-0940-4e48-8aa6-1d2026597c2f
     health HEALTH_ERR
            1 pgs inconsistent
            1 scrub errors
            noout flag(s) set
     monmap e15: 3 mons at {hyperv029=10.4.36.179:6789/0,hyperv030=10.4.36.180:6789/0,hyperv031=10.4.36.181:6789/0}
            election epoch 27192, quorum 0,1,2 hyperv029,hyperv030,hyperv031
      fsmap e17181: 1/1/1 up {0=hyperv029=up:active}, 2 up:standby
     osdmap e342930: 385 osds: 385 up, 385 in
            flags noout
      pgmap v37580512: 34816 pgs, 5 pools, 673 TB data, 198 Mobjects
            1583 TB used, 840 TB / 2423 TB avail
               34809 active+clean
                   4 active+clean+scrubbing+deep
                   2 active+clean+scrubbing
                   1 active+clean+inconsistent
  client io 87543 kB/s rd, 671 MB/s wr, 23 op/s rd, 2846 op/s wr

# ceph pg dump | grep inconsistent
6.13f1  4692    0       0       0       0 16057314767     3087    3087    active+clean+inconsistent 2016-12-14 16:49:48.391572      342929'41011    342929:43966 [158,215,364]   158     [158,215,364]   158     342928'40540 2016-12-14 16:49:48.391511      342928'40540    2016-12-14 16:49:48.391511

I tried a couple of other deep scrubs on pg 6.13f1 but got repeated errors.  In the OSD logs:

2016-12-14 16:48:07.733291 7f3b56e3a700 -1 log_channel(cluster) log [ERR] : deep-scrub 6.13f1 6:8fc91b77:::1000187bb70.00000009:head on disk size (0) does not match object info size (1835008) adjusted for ondisk to (1835008)
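
(For reference, a single pg can be re-scrubbed and its acting set confirmed with the commands below; the acting set matches the [158,215,364] in the pg dump above.)

# ceph pg deep-scrub 6.13f1
# ceph pg map 6.13f1
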
I looked at the objects on the 3 OSDs on their respective hosts, and they are the same zero-length files:

# cd ~ceph/osd/ceph-158/current/6.13f1_head
# find . -name *1000187bb70* -ls
669738    0 -rw-r--r--   1 ceph     ceph            0 Dec 13 17:00 ./DIR_1/DIR_F/DIR_3/DIR_9/DIR_8/1000187bb70.00000009__head_EED893F1__6

# cd ~ceph/osd/ceph-215/current/6.13f1_head
# find . -name *1000187bb70* -ls
539815647    0 -rw-r--r--   1 ceph     ceph            0 Dec 13 17:00 ./DIR_1/DIR_F/DIR_3/DIR_9/DIR_8/1000187bb70.00000009__head_EED893F1__6

# cd ~ceph/osd/ceph-364/current/6.13f1_head
# find . -name *1000187bb70* -ls
1881432215    0 -rw-r--r--   1 ceph     ceph            0 Dec 13 17:00 ./DIR_1/DIR_F/DIR_3/DIR_9/DIR_8/1000187bb70.00000009__head_EED893F1__6

At the time of the write, there wasn't anything unusual going on as far as I can tell (no hardware/network issues, all processes were up, etc).

This pool is a CephFS data pool, and the corresponding file (inode hex 1000187bb70, decimal 1099537300336) looks like this:

# ls -li chr4.tags.tsv
1099537300336 -rw-r--r-- 1 xichen xichen 14469915 Dec 13 17:01 chr4.tags.tsv
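
The object name prefix is just this inode number in hex; a quick check:

# printf '%d\n' 0x1000187bb70
1099537300336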

Reading the file is also ok (no errors, right number of bytes):
# cat chr4.tags.tsv > /dev/null
# wc chr4.tags.tsv
  592251  2961255 14469915 chr4.tags.tsv

We are using the standard 4MB object size for CephFS, and if I interpret this right, this is chunk index 9, so there shouldn't be any data there (or even a chunk 9 at all), since the file is only about 14MB.  Should I run pg repair on this?  Any ideas on how this could have come about?  Any other recommendations?
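
For concreteness, the chunk arithmetic (assuming the default 4 MiB object size, i.e. 4194304 bytes):

# echo $(( 9 * 4194304 ))
37748736
# echo $(( 14469915 / 4194304 ))
3

So a chunk with suffix .00000009 would start at byte 37748736, well past the end of the 14469915-byte file, which only needs chunks .00000000 through .00000003.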

Thanks,

Andras
apataki@xxxxxxxxxxx



_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
