Re: [Solved] Oops: lost cluster with: ceph osd require-osd-release luminous

On 9/12/17 9:13 PM, Josh Durgin wrote:

Could you post your crushmap? PGs mapping to no OSDs is a symptom of something wrong there.


You can stop the osds from changing position at startup with 'osd crush update on start = false':


Yes, I had found that. Thanks. It seems to be by design, which we didn't understand.
We will try device classes.
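
For reference, a rough sketch of both options (osd.80, the rule name and
the pool name below are just placeholders):

# in ceph.conf, [osd] section: keep OSDs where the crush map puts them
osd crush update on start = false

# Luminous device classes: tag the NVMe OSDs and build a rule on the class
ceph osd crush rm-device-class osd.80
ceph osd crush set-device-class nvme osd.80
ceph osd crush rule create-replicated nvme-rule default host nvme
ceph osd pool set ssd-pool crush_rule nvme-rule

With a device-class based rule the NVMe OSDs should keep their place
across restarts, without a hand-edited crush tree.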

My "big" problem turned out to be a cosmetic problem.
Although the whole thing looks quite ugly: every metric is 0 wherever you look,
and you can't really use any Ceph management anymore.

But the whole system kept functioning. Since it was a remote test site, I didn't notice that earlier.

So the whole problem was that the MGR servers were up, but a firewall prevented contact with them.
The moment "ceph osd require-osd-release luminous" was set, the old backwards-compatible metrics path stopped working and the cluster switched to the now-mandatory MGR daemons.
And then you get these kinds of all-zero readings.
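
For anyone running into the same thing: the mgr daemons listen in the
6800-7300 port range by default, so it comes down to letting that traffic
through to the mgr hosts. With firewalld, for example, something like:

firewall-cmd --permanent --zone=public --add-port=6800-7300/tcp
firewall-cmd --reload

# and to see which address the active mgr is actually on:
ceph mgr dump | grep active_addr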

So even without visible management, and with the administrator thinking it was dead, Ceph kept running.
One could say that Ceph also managed to create a successful upgrade path to 12.2. Well done.

Thanks for your time

The only minor problem left is a scrub error, with pg repair doing nothing.
And because of BlueStore there is no easy access to the underlying files.

rados list-inconsistent-pg default.rgw.buckets.data
["15.720"]

rados list-inconsistent-obj 15.720 --format=json-pretty
No scrub information available for pg 15.720
error 2: (2) No such file or directory

Other people seem to have this problem too: http://tracker.ceph.com/issues/15781
I've read that perhaps a better pg repair will be built. Will wait for that.
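
In case it helps someone else, the usual sequence (just a sketch, and it
may well be what was already tried here) is to deep-scrub the PG first so
there is fresh scrub information, then inspect and repair:

ceph pg deep-scrub 15.720
# wait for the deep scrub to finish, then:
rados list-inconsistent-obj 15.720 --format=json-pretty
ceph pg repair 15.720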




Sent from Nine
From: Jan-Willem Michels <jwillem@xxxxxxxxx>
Sent: Sep 11, 2017 23:50
To: ceph-users@xxxxxxxxxxxxxx
Subject: Oops: lost cluster with: ceph osd require-osd-release luminous

We have a Kraken cluster, newly built at the time, with BlueStore enabled.
It is 8 systems, each with 10 disks of 10 TB, and each computer has one
2 TB NVMe disk.
3 monitors etc.
About 700 TB total and 300 TB used. Mainly an S3 object store.

Of course there is more to the story: we have one strange thing in our
cluster.
We tried to create two pools of storage, default and ssd, and created a
new crush rule.
That worked without problems for months.
But when we restarted a computer / NVMe OSD, it would "forget" that the
NVMe should be connected to the SSD pool (for that particular computer).
Since we don't restart systems, we didn't notice that.
The NVMe would reappear in the default pool.
When we re-applied the same crush rule, it would go back to the SSD
pool.
All while data kept being served from the NVMe disks.
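
For context, re-applying the rule here means a crush map round trip,
roughly like this (file names are placeholders):

ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# edit crushmap.txt to move the NVMe OSDs back where they belong
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new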

Clearly something is not ideal there. And Luminous has a different
approach to separating SSD from HDD.
So we thought we would first go to Luminous 12.2.0 and later see how to fix this.

We did an upgrade to Luminous and that went well. That requires a
restart of the OSDs, so all NVMe devices were back in the default pool.
Reapplying the crush rule brought them back to the SSD pool.
Also, while doing the upgrade, we removed from ceph.conf the line:
enable experimental unrecoverable data corrupting features = bluestore
since in Luminous that is no longer needed.

Everything was working fine.
In ceph -s we had this health warning:

             all OSDs are running luminous or later but
require_osd_release < luminous
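
For reference, the current value of that flag can be read back from the
osdmap with something like:

ceph osd dump | grep require_osd_release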

So I thought I would set the minimum OSD version to Luminous with:

ceph osd require-osd-release luminous

To us that seemed nothing more than a minimum software version
required to connect to the cluster.
The system answered back:

recovery_deletes is set

and that was it; the same second, ceph -s went to "0":

  ceph -s
   cluster:
     id:     5bafad08-31b2-4716-be77-07ad2e2647eb
     health: HEALTH_WARN
             noout flag(s) set
             Reduced data availability: 3248 pgs inactive
             Degraded data redundancy: 3248 pgs unclean

   services:
     mon: 3 daemons, quorum Ceph-Mon1,Ceph-Mon2,Ceph-Mon3
     mgr: Ceph-Mon2(active), standbys: Ceph-Mon3, Ceph-Mon1
     osd: 88 osds: 88 up, 88 in; 297 remapped pgs
          flags noout

   data:
     pools:   26 pools, 3248 pgs
     objects: 0 objects, 0 bytes
     usage:   0 kB used, 0 kB / 0 kB avail
     pgs:     100.000% pgs unknown
              3248 unknown

Before that it looked something like this. The errors you see (apart
from the scrub error) were from the upgrade / restarting, and I would
expect them to go away very quickly.

ceph -s
   cluster:
     id:     5bafad08-31b2-4716-be77-07ad2e2647eb
     health: HEALTH_ERR
             385 pgs backfill_wait
             5 pgs backfilling
             135 pgs degraded
             1 pgs inconsistent
             1 pgs peering
             4 pgs recovering
             131 pgs recovery_wait
             98 pgs stuck degraded
             525 pgs stuck unclean
             recovery 119/612465488 objects degraded (0.000%)
             recovery 24/612465488 objects misplaced (0.000%)
             1 scrub errors
             noout flag(s) set
             all OSDs are running luminous or later but
require_osd_release < luminous

   services:
     mon: 3 daemons, quorum Ceph-Mon1,Ceph-Mon2,Ceph-Mon3
     mgr: Ceph-Mon2(active), standbys: Ceph-Mon1, Ceph-Mon3
     osd: 88 osds: 88 up, 88 in; 387 remapped pgs
          flags noout

   data:
     pools:   26 pools, 3248 pgs
     objects: 87862k objects, 288 TB
     usage:   442 TB used, 300 TB / 742 TB avail
     pgs:     0.031% pgs not active
              119/612465488 objects degraded (0.000%)
              24/612465488 objects misplaced (0.000%)
              2720 active+clean
              385  active+remapped+backfill_wait
              131  active+recovery_wait+degraded
              5    active+remapped+backfilling
              4    active+recovering+degraded
              1    active+clean+inconsistent
              1    peering
              1    active+clean+scrubbing+deep

   io:
     client:   34264 B/s rd, 2091 kB/s wr, 38 op/s rd, 48 op/s wr
     recovery: 4235 kB/s, 6 objects/s

The current ceph health detail:

HEALTH_WARN noout flag(s) set; Reduced data availability: 3248 pgs
inactive; Degraded data redundancy: 3248 pgs unclean
OSDMAP_FLAGS noout flag(s) set
PG_AVAILABILITY Reduced data availability: 3248 pgs inactive
     pg 15.7cd is stuck inactive for 24780.157341, current state
unknown, last acting []
     pg 15.7ce is stuck inactive for 24780.157341, current state
unknown, last acting []
     pg 15.7cf is stuck inactive for 24780.157341, current state
unknown, last acting []
..
     pg 15.7ff is stuck inactive for 24728.059692, current state
unknown, last acting []
PG_DEGRADED Degraded data redundancy: 3248 pgs unclean
     pg 15.7cd is stuck unclean for 24728.059692, current state unknown,
last acting []
     pg 15.7ce is stuck unclean for 24728.059692, current state unknown,
last acting []
....
     pg 15.7fc is stuck unclean for 21892.783340, current state unknown,
last acting []
     pg 15.7fd is stuck unclean for 21892.783340, current state unknown,
last acting []
     pg 15.7fe is stuck unclean for 21892.783340, current state unknown,
last acting []
     pg 15.7ff is stuck unclean for 21892.783340, current state unknown,
last acting []

  ceph pg dump_stuck unclean | more
15.46b  unknown []         -1     []             -1
15.46a  unknown []         -1     []             -1
15.469  unknown []         -1     []             -1
15.468  unknown []         -1     []             -1
15.467  unknown []         -1     []             -1
15.466  unknown []         -1     []             -1
15.465  unknown []         -1     []             -1
15.464  unknown []         -1     []             -1
15.463  unknown []         -1     []             -1
....
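
To look at a single stuck PG individually, one can ask the monitors where
it should map according to the current osdmap, for example:

ceph pg map 15.46b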


Any ideas?

Greetings JW

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

