Re: - cluster stuck and undersized if at least one osd is down

On Mon, Nov 28, 2016 at 9:54 PM, Piotr Dzionek <piotr.dzionek@xxxxxxxx> wrote:
> Hi,
> I recently installed a 3-node Ceph cluster, v10.2.3. It has 3 mons and 12
> OSDs. I removed the default pool and created the following one:
>
> pool 7 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 1024 pgp_num 1024 last_change 126 flags hashpspool
> stripe_width 0

Do you understand the significance of min_size 1?

Are you OK with the likelihood of data loss that this value introduces?
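
With size 2 and min_size 1 a PG keeps accepting writes with only a single
copy left, so losing that one remaining OSD during the degraded window loses
data. If that is not what you want, the setting can be checked and raised
with the standard pool commands (untested here, and assuming the pool is
still called 'data'):

  ceph osd pool get data min_size
  ceph osd pool set data min_size 2

The trade-off is that with size 2 / min_size 2 a degraded PG blocks I/O
instead of serving it from a single copy.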

>
> The cluster is healthy when all OSDs are up; however, if I stop any of the
> OSDs, it becomes stuck and undersized - it does not rebuild.
>
>     cluster *****
>      health HEALTH_WARN
>             166 pgs degraded
>             108 pgs stuck unclean
>             166 pgs undersized
>             recovery 67261/827220 objects degraded (8.131%)
>             1/12 in osds are down
>      monmap e3: 3 mons at
> {**osd01=***.144:6789/0,***osd02=***.145:6789/0,**osd03=*****.146:6789/0}
>             election epoch 14, quorum 0,1,2 **osd01,**osd02,**osd03
>      osdmap e161: 12 osds: 11 up, 12 in; 166 remapped pgs
>             flags sortbitwise
>       pgmap v307710: 1024 pgs, 1 pools, 1230 GB data, 403 kobjects
>             2452 GB used, 42231 GB / 44684 GB avail
>             67261/827220 objects degraded (8.131%)
>                  858 active+clean
>                  166 active+undersized+degraded
>
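While it is in that state it is worth checking which PGs are stuck and why,
e.g. with the standard diagnostics (output will vary; <pgid> below is just a
placeholder for one of the stuck PG ids):

  ceph health detail
  ceph pg dump_stuck unclean
  ceph pg <pgid> query

The query output shows which OSDs the PG wants, which it currently has, and
why recovery is not proceeding.
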
> Replica size is 2 and I use the following crush map:
>
> # begin crush map
> tunable choose_local_tries 0
> tunable choose_local_fallback_tries 0
> tunable choose_total_tries 50
> tunable chooseleaf_descend_once 1
> tunable chooseleaf_vary_r 1
> tunable straw_calc_version 1
>
> # devices
> device 0 osd.0
> device 1 osd.1
> device 2 osd.2
> device 3 osd.3
> device 4 osd.4
> device 5 osd.5
> device 6 osd.6
> device 7 osd.7
> device 8 osd.8
> device 9 osd.9
> device 10 osd.10
> device 11 osd.11
>
> # types
> type 0 osd
> type 1 host
> type 2 chassis
> type 3 rack
> type 4 row
> type 5 pdu
> type 6 pod
> type 7 room
> type 8 datacenter
> type 9 region
> type 10 root
>
> # buckets
> host osd01 {
>         id -2           # do not change unnecessarily
>         # weight 14.546
>         alg straw
>         hash 0  # rjenkins1
>         item osd.0 weight 3.636
>         item osd.1 weight 3.636
>         item osd.2 weight 3.636
>         item osd.3 weight 3.636
> }
> host osd02 {
>         id -3           # do not change unnecessarily
>         # weight 14.546
>         alg straw
>         hash 0  # rjenkins1
>         item osd.4 weight 3.636
>         item osd.5 weight 3.636
>         item osd.6 weight 3.636
>         item osd.7 weight 3.636
> }
> host osd03 {
>         id -4           # do not change unnecessarily
>         # weight 14.546
>         alg straw
>         hash 0  # rjenkins1
>         item osd.8 weight 3.636
>         item osd.9 weight 3.636
>         item osd.10 weight 3.636
>         item osd.11 weight 3.636
> }
> root default {
>         id -1           # do not change unnecessarily
>         # weight 43.637
>         alg straw
>         hash 0  # rjenkins1
>         item osd01 weight 14.546
>         item osd02 weight 14.546
>         item osd03 weight 14.546
> }
>
> # rules
> rule replicated_ruleset {
>         ruleset 0
>         type replicated
>         min_size 1
>         max_size 10
>         step take default
>         step chooseleaf firstn 0 type host
>         step emit
> }
>
> # end crush map
>
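One way to sanity-check that rule against the map is crushtool's test mode,
roughly like this (sketch only, assuming the decompiled map above is saved
as crushmap.txt):

  crushtool -c crushmap.txt -o crushmap.bin
  crushtool -i crushmap.bin --test --rule 0 --num-rep 2 --show-mappings

Every sampled PG should map to two OSDs on different hosts; mappings that
come back with fewer than two OSDs would point at a CRUSH problem rather
than a pool problem.
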
> I am not sure what the reason for the undersized state is. All OSD disks
> are the same size and the replica size is 2. Also, data is only replicated
> on a per-host basis and I have 3 separate hosts. Maybe the number of PGs
> is incorrect? Is 1024 too big? Or maybe there is some misconfiguration in
> the crush map?
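
As a rough check against the usual ~100 PGs-per-OSD rule of thumb (my
arithmetic, not from the original mail):

  12 OSDs * 100 / 2 replicas = 600   -> round to a power of two: 512 or 1024
  1024 PGs * 2 replicas / 12 OSDs ~= 170 PGs per OSD

That is above the ~100 target but not in itself a reason for PGs to stay
undersized.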
>
>
> Kind regards,
> Piotr Dzionek
>
>



-- 
Cheers,
Brad
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


