Re: - cluster stuck and undersized if at least one osd is down

In your cluster the OSD is down, but not out. Data only starts to rebuild once an OSD is marked out; at that point the status will show 11/11 osds up instead of 1/12 in osds down.
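By default the monitors will do this for you: a down OSD is automatically marked out after mon_osd_down_out_interval (600 seconds unless you have changed it), and backfill begins at that point. If you want recovery to start sooner, you can mark the OSD out by hand. A minimal sketch, assuming osd.3 is the one you stopped (take the real id from the tree output):

    # find which OSD is down
    ceph osd tree | grep down
    # mark it out so recovery/backfill starts immediately
    ceph osd out 3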


David Turner | Cloud Operations Engineer | StorageCraft Technology Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2760 | Mobile: 385.224.2943


If you are not the intended recipient of this message or received it erroneously, please notify the sender and delete it, together with any attachments, and be advised that any dissemination or copying of this message is prohibited.



From: ceph-users [ceph-users-bounces@xxxxxxxxxxxxxx] on behalf of Piotr Dzionek [piotr.dzionek@xxxxxxxx]
Sent: Monday, November 28, 2016 4:54 AM
To: ceph-users@xxxxxxxxxxxxxx
Subject: [ceph-users] - cluster stuck and undersized if at least one osd is down

Hi,
I recently installed a 3-node Ceph cluster, v10.2.3. It has 3 mons and 12 OSDs. I removed the default pool and created the following one:

pool 7 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 126 flags hashpspool stripe_width 0
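For context, a rough sketch of how a pool with those settings would be created (name, pg counts, and sizes taken from the dump above):

    ceph osd pool create data 1024 1024 replicated
    ceph osd pool set data size 2
    ceph osd pool set data min_size 1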

The cluster is healthy when all OSDs are up; however, if I stop any one of the OSDs, it becomes stuck and undersized and never starts rebuilding.

    cluster *****
     health HEALTH_WARN
            166 pgs degraded
            108 pgs stuck unclean
            166 pgs undersized
            recovery 67261/827220 objects degraded (8.131%)
            1/12 in osds are down
     monmap e3: 3 mons at {**osd01=***.144:6789/0,***osd02=***.145:6789/0,**osd03=*****.146:6789/0}
            election epoch 14, quorum 0,1,2 **osd01,**osd02,**osd03
     osdmap e161: 12 osds: 11 up, 12 in; 166 remapped pgs
            flags sortbitwise
      pgmap v307710: 1024 pgs, 1 pools, 1230 GB data, 403 kobjects
            2452 GB used, 42231 GB / 44684 GB avail
            67261/827220 objects degraded (8.131%)
                 858 active+clean
                 166 active+undersized+degraded
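The stuck PGs can be inspected with the following (7.0 below is a placeholder PG id; substitute one from the stuck list):

    # per-PG detail behind the HEALTH_WARN
    ceph health detail
    # list PGs stuck unclean with their acting OSDs
    ceph pg dump_stuck unclean
    # full state of a single stuck PG
    ceph pg 7.0 query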

Replica size is 2, and I use the following crush map:

# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable straw_calc_version 1

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 osd.8
device 9 osd.9
device 10 osd.10
device 11 osd.11

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host osd01 {
        id -2           # do not change unnecessarily
        # weight 14.546
        alg straw
        hash 0  # rjenkins1
        item osd.0 weight 3.636
        item osd.1 weight 3.636
        item osd.2 weight 3.636
        item osd.3 weight 3.636
}
host osd02 {
        id -3           # do not change unnecessarily
        # weight 14.546
        alg straw
        hash 0  # rjenkins1
        item osd.4 weight 3.636
        item osd.5 weight 3.636
        item osd.6 weight 3.636
        item osd.7 weight 3.636
}
host osd03 {
        id -4           # do not change unnecessarily
        # weight 14.546
        alg straw
        hash 0  # rjenkins1
        item osd.8 weight 3.636
        item osd.9 weight 3.636
        item osd.10 weight 3.636
        item osd.11 weight 3.636
}
root default {
        id -1           # do not change unnecessarily
        # weight 43.637
        alg straw
        hash 0  # rjenkins1
        item osd01 weight 14.546
        item osd02 weight 14.546
        item osd03 weight 14.546
}

# rules
rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}

# end crush map
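To rule the crush map in or out, it can be tested offline with crushtool; a minimal sketch, assuming the compiled map is saved to /tmp/crush.bin:

    # grab the compiled crush map from the cluster
    ceph osd getcrushmap -o /tmp/crush.bin
    # simulate rule 0 with 2 replicas and report any mappings that fail
    crushtool -i /tmp/crush.bin --test --rule 0 --num-rep 2 --show-bad-mappings

With 3 hosts and "step chooseleaf firstn 0 type host", two replicas should always land on two different hosts, so this should report no bad mappings.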

I am not sure what the reason for the undersized state is. All OSD disks are the same size and the replica size is 2. Also, data is only replicated on a per-host basis and I have 3 separate hosts. Maybe the number of PGs is incorrect? Is 1024 too big? Or maybe there is some misconfiguration in the crush map?
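For reference, the usual sizing rule of thumb from the Ceph documentation is roughly 100 PGs per OSD, i.e. pg_num ≈ (OSDs × 100) / replica size = (12 × 100) / 2 = 600, which rounded up to the next power of two gives 1024. So 1024 looks reasonable for this cluster and should not by itself cause the undersized state.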


Kind regards,
Piotr Dzionek

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
