Re: Random health OSD_SCRUB_ERRORS on various OSDs, after pg repair back to HEALTH_OK

Hi

I just posted my logs and my issue to the Ceph tracker.

Let's hope this gets fixed.

Thanks


On 05/03/2018 13:36, Paul Emmerich wrote:
Hi,

yeah, the cluster that I'm seeing this on also has only one host that reports that specific checksum. Two other hosts only report the same error that you are seeing.

Could you post to the tracker issue that you are also seeing this?

Paul

2018-03-05 12:21 GMT+01:00 Marco Baldini - H.S. Amiata <mbaldini@xxxxxxxxxxx>:

Hi

After some days with debug_osd 5/5 I found [ERR] entries on different days, in different PGs, on different OSDs, and on different hosts. This is what I get in the OSD logs:

OSD.5 (host 3)
2018-03-01 20:30:02.702269 7fdf4d515700  2 osd.5 pg_epoch: 16486 pg[9.1c( v 16486'51798 (16431'50251,16486'51798] local-lis/les=16474/16475 n=3629 ec=1477/1477 lis/c 16474/16474 les/c/f 16475/16477/0 16474/16474/16474) [5,6] r=0 lpr=16474 crt=16486'51798 lcod 16486'51797 mlcod 16486'51797 active+clean+scrubbing+deep] 9.1c shard 6: soid 9:3b157c56:::rbd_data.1526386b8b4567.0000000000001761:head candidate had a read error
2018-03-01 20:30:02.702278 7fdf4d515700 -1 log_channel(cluster) log [ERR] : 9.1c shard 6: soid 9:3b157c56:::rbd_data.1526386b8b4567.0000000000001761:head candidate had a read error

OSD.4 (host 3)
2018-02-28 00:03:33.458558 7f112cf76700 -1 log_channel(cluster) log [ERR] : 13.65 shard 2: soid 13:a719ecdf:::rbd_data.5f65056b8b4567.000000000000f8eb:head candidate had a read error
OSD.8 (host 2)
2018-02-27 23:55:15.100084 7f4dd0816700 -1 log_channel(cluster) log [ERR] : 14.31 shard 1: soid 14:8cc6cd37:::rbd_data.30b15b6b8b4567.00000000000081a1:head candidate had a read error

I don't know what this error means, and as always a ceph pg repair fixes it. I don't think this is normal.
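
Next time one of these PGs shows up as inconsistent, it may be worth dumping what the scrub actually found before repairing it; a minimal sketch, using the PG from the log above just as an example:

rados list-inconsistent-obj 9.1c --format=json-pretty

The JSON output should show, per shard, whether the object hit a read error or a checksum mismatch.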

Ideas?

Thanks


On 28/02/2018 14:48, Marco Baldini - H.S. Amiata wrote:

Hi

I read the bug tracker issue and it looks a lot like my problem, even though I can't check the reported checksum because I don't have it in my logs; perhaps that's because of debug osd = 0/0 in ceph.conf.

I just raised the OSD log level

ceph tell osd.* injectargs --debug-osd 5/5

I'll check OSD logs in the next days...
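
A quick way to look for that specific checksum (or any read error) across the OSD logs once the higher debug level is active; just a sketch, assuming the default /var/log/ceph/ceph-osd.*.log paths:

grep -E 'candidate had a read error|0x6706be76' /var/log/ceph/ceph-osd.*.log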

Thanks



On 28/02/2018 11:59, Paul Emmerich wrote:
Hi,


Can you check the OSD log file to see if the reported checksum is 0x6706be76?


Paul

On 28.02.2018 at 11:43, Marco Baldini - H.S. Amiata <mbaldini@xxxxxxxxxxx> wrote:

Hello

I have a small Ceph cluster with 3 nodes, each with 3x 1TB HDD and 1x 240GB SSD. I created this cluster after the Luminous release, so all OSDs are Bluestore. In my CRUSH map I have two rules, one targeting the SSDs and one targeting the HDDs. I have 4 pools, one using the SSD rule and the others using the HDD rule; three pools are size=3 min_size=2, one is size=2 min_size=1 (that one holds content that is OK to lose).
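
For reference, device-class rules like these can be created with something along these lines (just a sketch; the rule names here are made up and are not necessarily the ones in my CRUSH map):

ceph osd crush rule create-replicated hdd_rule default host hdd
ceph osd crush rule create-replicated ssd_rule default host ssd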

For the last 3 months I've been having a strange random problem. I scheduled my OSD scrubs during the night (osd scrub begin hour = 20, osd scrub end hour = 7), when the office is closed, so there is low impact on the users. Some mornings, when I check the cluster health, I find:

HEALTH_ERR X scrub errors; Possible data damage: Y pgs inconsistent
OSD_SCRUB_ERRORS X scrub errors
PG_DAMAGED Possible data damage: Y pg inconsistent

X and Y sometimes are 1, sometimes 2.

I issue a ceph health detail, check the damaged PGs, and run a ceph pg repair for each damaged PG; I get

instructing pg PG on osd.N to repair

The PGs are different, the OSD that has to repair the PG is different, even the node hosting the OSD is different; I made a list of all PGs and OSDs. This morning is the most recent case:

> ceph health detail
HEALTH_ERR 2 scrub errors; Possible data damage: 2 pgs inconsistent
OSD_SCRUB_ERRORS 2 scrub errors
PG_DAMAGED Possible data damage: 2 pgs inconsistent
pg 13.65 is active+clean+inconsistent, acting [4,2,6]
pg 14.31 is active+clean+inconsistent, acting [8,3,1]
> ceph pg repair 13.65
instructing pg 13.65 on osd.4 to repair

(node-2)> tail /var/log/ceph/ceph-osd.4.log
2018-02-28 08:38:47.593447 7f112cf76700  0 log_channel(cluster) log [DBG] : 13.65 repair starts
2018-02-28 08:39:37.573342 7f112cf76700  0 log_channel(cluster) log [DBG] : 13.65 repair ok, 0 fixed
> ceph pg repair 14.31
instructing pg 14.31 on osd.8 to repair

(node-3)> tail /var/log/ceph/ceph-osd.8.log
2018-02-28 08:52:37.297490 7f4dd0816700  0 log_channel(cluster) log [DBG] : 14.31 repair starts
2018-02-28 08:53:00.704020 7f4dd0816700  0 log_channel(cluster) log [DBG] : 14.31 repair ok, 0 fixed
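
When more than one PG is flagged, those two steps can be strung together; a rough sketch of the idea (not something I claim to run routinely):

ceph health detail | awk '$1 == "pg" && /inconsistent/ {print $2}' | xargs -r -n1 ceph pg repair

This just pulls the PG ids from the "pg X.YY is active+clean+inconsistent" lines and issues a repair for each.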


I made a list of when I got OSD_SCRUB_ERRORS, which PG was affected, and which OSD had to repair it. Dates are dd/mm/yyyy:

21/12/2017   --  pg 14.29 is active+clean+inconsistent, acting [6,2,4]

18/01/2018   --  pg 14.5a is active+clean+inconsistent, acting [6,4,1]

22/01/2018   --  pg 9.3a is active+clean+inconsistent, acting [2,7]

29/01/2018   --  pg 13.3e is active+clean+inconsistent, acting [4,6,1]
                 instructing pg 13.3e on osd.4 to repair

07/02/2018   --  pg 13.7e is active+clean+inconsistent, acting [8,2,5]
                 instructing pg 13.7e on osd.8 to repair

09/02/2018   --  pg 13.30 is active+clean+inconsistent, acting [7,3,2]
                 instructing pg 13.30 on osd.7 to repair

15/02/2018   --  pg 9.35 is active+clean+inconsistent, acting [1,8]
                 instructing pg 9.35 on osd.1 to repair

                 pg 13.3e is active+clean+inconsistent, acting [4,6,1]
                 instructing pg 13.3e on osd.4 to repair

17/02/2018   --  pg 9.2d is active+clean+inconsistent, acting [7,5]
                 instructing pg 9.2d on osd.7 to repair                 

22/02/2018   --  pg 9.24 is active+clean+inconsistent, acting [5,8]
                 instructing pg 9.24 on osd.5 to repair

28/02/2018   --  pg 13.65 is active+clean+inconsistent, acting [4,2,6]
                 instructing pg 13.65 on osd.4 to repair

                 pg 14.31 is active+clean+inconsistent, acting [8,3,1]
                 instructing pg 14.31 on osd.8 to repair



In case it's useful, my ceph.conf is here:

[global]
auth client required = none
auth cluster required = none
auth service required = none
fsid = 24d5d6bc-0943-4345-b44e-46c19099004b
cluster network = 10.10.10.0/24
public network = 10.10.10.0/24
keyring = /etc/pve/priv/$cluster.$name.keyring
mon allow pool delete = true
osd journal size = 5120
osd pool default min size = 2
osd pool default size = 3
bluestore_block_db_size = 64424509440

debug asok = 0/0
debug auth = 0/0
debug buffer = 0/0
debug client = 0/0
debug context = 0/0
debug crush = 0/0
debug filer = 0/0
debug filestore = 0/0
debug finisher = 0/0
debug heartbeatmap = 0/0
debug journal = 0/0
debug journaler = 0/0
debug lockdep = 0/0
debug mds = 0/0
debug mds balancer = 0/0
debug mds locker = 0/0
debug mds log = 0/0
debug mds log expire = 0/0
debug mds migrator = 0/0
debug mon = 0/0
debug monc = 0/0
debug ms = 0/0
debug objclass = 0/0
debug objectcacher = 0/0
debug objecter = 0/0
debug optracker = 0/0
debug osd = 0/0
debug paxos = 0/0
debug perfcounter = 0/0
debug rados = 0/0
debug rbd = 0/0
debug rgw = 0/0
debug throttle = 0/0
debug timer = 0/0
debug tp = 0/0


[osd]
keyring = /var/lib/ceph/osd/ceph-$id/keyring
osd max backfills = 1
osd recovery max active = 1

osd scrub begin hour = 20
osd scrub end hour = 7
osd scrub during recovery = false
osd scrub load threshold = 0.3

[client]
rbd cache = true
rbd cache size = 268435456      # 256MB
rbd cache max dirty = 201326592    # 192MB
rbd cache max dirty age = 2
rbd cache target dirty = 33554432    # 32MB
rbd cache writethrough until flush = true


#[mgr]
#debug_mgr = 20


[mon.pve-hs-main]
host = pve-hs-main
mon addr = 10.10.10.251:6789

[mon.pve-hs-2]
host = pve-hs-2
mon addr = 10.10.10.252:6789

[mon.pve-hs-3]
host = pve-hs-3
mon addr = 10.10.10.253:6789
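
To confirm which of these values the OSDs are actually running with (for example the scrub window and the debug levels), something like this can be run on each node; a sketch, osd.0 is just an example:

ceph daemon osd.0 config show | grep -E 'debug_osd|osd_scrub'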


My ceph versions:

{
    "mon": {
        "ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)": 3
    },
    "mgr": {
        "ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)": 3
    },
    "osd": {
        "ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)": 12
    },
    "mds": {},
    "overall": {
        "ceph version 12.2.2 (215dd7151453fae88e6f968c975b6ce309d42dcf) luminous (stable)": 18
    }
}



My ceph osd tree:

ID CLASS WEIGHT  TYPE NAME            STATUS REWEIGHT PRI-AFF
-1       8.93686 root default
-6       2.94696     host pve-hs-2
 3   hdd 0.90959         osd.3            up  1.00000 1.00000
 4   hdd 0.90959         osd.4            up  1.00000 1.00000
 5   hdd 0.90959         osd.5            up  1.00000 1.00000
10   ssd 0.21819         osd.10           up  1.00000 1.00000
-3       2.86716     host pve-hs-3
 6   hdd 0.85599         osd.6            up  1.00000 1.00000
 7   hdd 0.85599         osd.7            up  1.00000 1.00000
 8   hdd 0.93700         osd.8            up  1.00000 1.00000
11   ssd 0.21819         osd.11           up  1.00000 1.00000
-7       3.12274     host pve-hs-main
 0   hdd 0.96819         osd.0            up  1.00000 1.00000
 1   hdd 0.96819         osd.1            up  1.00000 1.00000
 2   hdd 0.96819         osd.2            up  1.00000 1.00000
 9   ssd 0.21819         osd.9            up  1.00000 1.00000

My pools:

pool 9 'cephbackup' replicated size 2 min_size 1 crush_rule 1 object_hash rjenkins pg_num 64 pgp_num 64 last_change 5665 flags hashpspool stripe_width 0 application rbd
        removed_snaps [1~3]
pool 13 'cephwin' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 128 pgp_num 128 last_change 16454 flags hashpspool stripe_width 0 application rbd
        removed_snaps [1~5]
pool 14 'cephnix' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 128 pgp_num 128 last_change 16482 flags hashpspool stripe_width 0 application rbd
        removed_snaps [1~227]
pool 17 'cephssd' replicated size 3 min_size 2 crush_rule 2 object_hash rjenkins pg_num 64 pgp_num 64 last_change 8601 flags hashpspool stripe_width 0 application rbd
        removed_snaps [1~3]


I can't understand where the problem comes from. I don't think it's hardware: if I had a failing disk, I should always get problems on the same OSD. Any ideas?
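
One way to double-check the hardware theory would be SMART on each node's disks; a rough sketch (the device names are just placeholders for the HDDs behind the OSDs, run as root):

for dev in /dev/sd[a-d]; do
    echo "== $dev =="
    smartctl -A "$dev" | grep -Ei 'reallocated|pending|uncorrect'
done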

Thanks



--
Marco Baldini
H.S. Amiata Srl
Ufficio:   0577-779396
Cellulare:   335-8765169
WEB:   www.hsamiata.it
EMAIL:   mbaldini@xxxxxxxxxxx
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

-- 
Best Regards
Paul Emmerich

croit GmbH
Freseniusstr. 31h
81247 München

Managing Director: Martin Verges
Commercial Register: Amtsgericht München
VAT ID: DE310638492







--
Paul Emmerich

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

--
Marco Baldini
H.S. Amiata Srl
Ufficio:   0577-779396
Cellulare:   335-8765169
WEB:   www.hsamiata.it
EMAIL:   mbaldini@xxxxxxxxxxx
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
