Re: - cluster stuck and undersized if at least one osd is down

Piotr Dzionek <piotr.dzionek@xxxxxxxx> · Tue, 29 Nov 2016 12:37:58 +0100



    Hi,

    You are right. I missed that there is a default timeout before a
    down OSD changes state from in to out: "mon osd down out interval"
    is 300 (seconds), and I didn't wait long enough before starting
    the OSD again.
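
The timing described in this message can be sketched as a toy model. This is not Ceph's monitor code: the function name `osd_state` and its argument are made up for illustration, the 300-second value is the one quoted in this message, and the exact behaviour at the boundary is an assumption.

```python
# Toy model of the down -> out transition (NOT Ceph's actual monitor logic).
DOWN_OUT_INTERVAL = 300  # seconds; "mon osd down out interval" as quoted above


def osd_state(seconds_down, interval=DOWN_OUT_INTERVAL):
    """Return (up/down, in/out) for an OSD stopped for `seconds_down` seconds.

    seconds_down=None means the OSD daemon is running.
    """
    if seconds_down is None:
        return ("up", "in")
    if seconds_down < interval:
        # Still counted as "in": PGs stay undersized, no rebuild starts yet.
        return ("down", "in")
    # Marked out (boundary handling is an assumption): data is remapped
    # to the remaining OSDs and recovery begins.
    return ("down", "out")


for t in (None, 60, 299, 300, 900):
    print(t, osd_state(t))
```

Restarting the OSD before the interval expires, as happened here, means the cluster never leaves the ("down", "in") state, so no rebuild is observed.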

        
    Kind regards,
    Piotr Dzionek

        
    On 28.11.2016 at 16:12, David Turner wrote:

    
      In your cluster the OSD is down, not out. Data starts to rebuild
      only when an OSD goes out. Once the OSD is marked out, the status
      will show 11/11 OSDs up instead of 1/12 OSDs down.

      
            
          
          From: ceph-users [ceph-users-bounces@xxxxxxxxxxxxxx] on behalf of Piotr Dzionek [piotr.dzionek@xxxxxxxx]
          Sent: Monday, November 28, 2016 4:54 AM
          To: ceph-users@xxxxxxxxxxxxxx
          Subject: - cluster stuck and undersized if at least one osd is down

            
            Hi,

              I recently installed a 3-node Ceph cluster, v10.2.3. It has
              3 mons and 12 OSDs. I removed the default pool and created
              the following one:

              
              pool 7 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 126 flags hashpspool stripe_width 0

              The cluster is healthy when all OSDs are up. However, if I
              stop any one of the OSDs, PGs become stuck and undersized,
              and the cluster does not rebuild.

              
                  cluster *****
                   health HEALTH_WARN
                          166 pgs degraded
                          108 pgs stuck unclean
                          166 pgs undersized
                          recovery 67261/827220 objects degraded (8.131%)
                          1/12 in osds are down
                   monmap e3: 3 mons at {**osd01=***.144:6789/0,***osd02=***.145:6789/0,**osd03=*****.146:6789/0}
                          election epoch 14, quorum 0,1,2 **osd01,**osd02,**osd03
                   osdmap e161: 12 osds: 11 up, 12 in; 166 remapped pgs
                          flags sortbitwise
                    pgmap v307710: 1024 pgs, 1 pools, 1230 GB data, 403 kobjects
                          2452 GB used, 42231 GB / 44684 GB avail
                          67261/827220 objects degraded (8.131%)
                               858 active+clean
                               166 active+undersized+degraded
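
The status output above already shows the "down but still in" situation. A minimal sketch of reading it programmatically follows; the helper names `parse_osdmap` and `degraded_percent` are hypothetical, and the regular expressions assume the exact line format shown in this post.

```python
import re

# Lines copied from the status output in this post.
OSDMAP_LINE = "osdmap e161: 12 osds: 11 up, 12 in; 166 remapped pgs"
DEGRADED_LINE = "recovery 67261/827220 objects degraded (8.131%)"


def parse_osdmap(line):
    """Extract total/up/in OSD counts from an 'osdmap' status line."""
    m = re.search(r"(\d+) osds: (\d+) up, (\d+) in", line)
    if m is None:
        raise ValueError("unrecognised osdmap line: %r" % line)
    total, up, in_ = (int(g) for g in m.groups())
    return {"total": total, "up": up, "in": in_}


def degraded_percent(line):
    """Recompute the degraded percentage from the raw object counts."""
    m = re.search(r"(\d+)/(\d+) objects degraded", line)
    if m is None:
        raise ValueError("unrecognised recovery line: %r" % line)
    return 100.0 * int(m.group(1)) / int(m.group(2))


counts = parse_osdmap(OSDMAP_LINE)
down = counts["total"] - counts["up"]  # OSDs whose daemon is not running
out = counts["total"] - counts["in"]   # OSDs excluded from data placement

if down > 0 and out == 0:
    print("%d OSD(s) down but still in: no rebuild has been triggered yet" % down)

print("degraded: %.3f%%" % degraded_percent(DEGRADED_LINE))
print("pg total:", 858 + 166)  # active+clean plus active+undersized+degraded
```

With 12 OSDs total, 11 up and 12 in, one OSD is down and none are out, which matches the "1/12 in osds are down" warning.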

            
            Replica size is 2 and I use the following crushmap:
              # begin crush map
              tunable choose_local_tries 0
              tunable choose_local_fallback_tries 0
              tunable choose_total_tries 50
              tunable chooseleaf_descend_once 1
              tunable chooseleaf_vary_r 1
              tunable straw_calc_version 1

              # devices
              device 0 osd.0
              device 1 osd.1
              device 2 osd.2
              device 3 osd.3
              device 4 osd.4
              device 5 osd.5
              device 6 osd.6
              device 7 osd.7
              device 8 osd.8
              device 9 osd.9
              device 10 osd.10
              device 11 osd.11

              # types
              type 0 osd
              type 1 host
              type 2 chassis
              type 3 rack
              type 4 row
              type 5 pdu
              type 6 pod
              type 7 room
              type 8 datacenter
              type 9 region
              type 10 root

              # buckets
              host osd01 {
                      id -2           # do not change unnecessarily
                      # weight 14.546
                      alg straw
                      hash 0  # rjenkins1
                      item osd.0 weight 3.636
                      item osd.1 weight 3.636
                      item osd.2 weight 3.636
                      item osd.3 weight 3.636
              }
              host osd02 {
                      id -3           # do not change unnecessarily
                      # weight 14.546
                      alg straw
                      hash 0  # rjenkins1
                      item osd.4 weight 3.636
                      item osd.5 weight 3.636
                      item osd.6 weight 3.636
                      item osd.7 weight 3.636
              }
              host osd03 {
                      id -4           # do not change unnecessarily
                      # weight 14.546
                      alg straw
                      hash 0  # rjenkins1
                      item osd.8 weight 3.636
                      item osd.9 weight 3.636
                      item osd.10 weight 3.636
                      item osd.11 weight 3.636
              }
              root default {
                      id -1           # do not change unnecessarily
                      # weight 43.637
                      alg straw
                      hash 0  # rjenkins1
                      item osd01 weight 14.546
                      item osd02 weight 14.546
                      item osd03 weight 14.546
              }

              # rules
              rule replicated_ruleset {
                      ruleset 0
                      type replicated
                      min_size 1
                      max_size 10
                      step take default
                      step chooseleaf firstn 0 type host
                      step emit
              }

              # end crush map

                
              I am not sure what causes the undersized state. All OSD
              disks are the same size and the replica size is 2. Data is
              replicated on a per-host basis, and I have 3 separate
              hosts. Maybe the number of PGs is incorrect? Is 1024 too
              big? Or is there some misconfiguration in the crushmap?
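
On the question of whether 1024 PGs is too many, a rough check is possible with the commonly cited rule of thumb from the Ceph placement-group documentation: about 100 PGs per OSD, divided by the pool size, rounded up to a power of two. The sketch below assumes that rule and a single pool; the function names and the `target_per_osd` parameter are illustrative, not part of any Ceph API.

```python
def suggested_pg_num(num_osds, pool_size, target_per_osd=100):
    """Rule-of-thumb pg_num: (OSDs * target) / size, rounded up to a power of 2."""
    raw = num_osds * target_per_osd / pool_size
    power = 1
    while power < raw:
        power *= 2
    return power


def pg_replicas_per_osd(pg_num, pool_size, num_osds):
    """Average number of PG replicas each OSD holds for one pool."""
    return pg_num * pool_size / num_osds


# Values from this post: 12 OSDs, replicated size 2, pg_num 1024.
print(suggested_pg_num(12, 2))                     # 1024
print(round(pg_replicas_per_osd(1024, 2, 12), 1))  # 170.7
```

Under that assumption, 1024 is the value the rule of thumb itself would suggest for this cluster, so the PG count is unlikely to explain the undersized state.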

            
            Kind regards,

            Piotr Dzionek
          
        
  
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com