On Mon, Nov 28, 2016 at 9:54 PM, Piotr Dzionek <piotr.dzionek@xxxxxxxx> wrote:
> Hi,
> I recently installed a 3-node ceph cluster, v10.2.3. It has 3 mons and
> 12 osds. I removed the default pool and created the following one:
>
> pool 7 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 1024 pgp_num 1024 last_change 126 flags hashpspool
> stripe_width 0

Do you understand the significance of min_size 1? Are you OK with the
likelihood of data loss that this value introduces?

> Cluster is healthy if all osds are up; however, if I stop any of the
> osds, it becomes stuck and undersized - it is not rebuilding.
>
>     cluster *****
>      health HEALTH_WARN
>             166 pgs degraded
>             108 pgs stuck unclean
>             166 pgs undersized
>             recovery 67261/827220 objects degraded (8.131%)
>             1/12 in osds are down
>      monmap e3: 3 mons at {**osd01=***.144:6789/0,***osd02=***.145:6789/0,**osd03=*****.146:6789/0}
>             election epoch 14, quorum 0,1,2 **osd01,**osd02,**osd03
>      osdmap e161: 12 osds: 11 up, 12 in; 166 remapped pgs
>             flags sortbitwise
>       pgmap v307710: 1024 pgs, 1 pools, 1230 GB data, 403 kobjects
>             2452 GB used, 42231 GB / 44684 GB avail
>             67261/827220 objects degraded (8.131%)
>                  858 active+clean
>                  166 active+undersized+degraded
>
> Replica size is 2 and I use the following crushmap:
>
> # begin crush map
> tunable choose_local_tries 0
> tunable choose_local_fallback_tries 0
> tunable choose_total_tries 50
> tunable chooseleaf_descend_once 1
> tunable chooseleaf_vary_r 1
> tunable straw_calc_version 1
>
> # devices
> device 0 osd.0
> device 1 osd.1
> device 2 osd.2
> device 3 osd.3
> device 4 osd.4
> device 5 osd.5
> device 6 osd.6
> device 7 osd.7
> device 8 osd.8
> device 9 osd.9
> device 10 osd.10
> device 11 osd.11
>
> # types
> type 0 osd
> type 1 host
> type 2 chassis
> type 3 rack
> type 4 row
> type 5 pdu
> type 6 pod
> type 7 room
> type 8 datacenter
> type 9 region
> type 10 root
>
> # buckets
> host osd01 {
>         id -2           # do not change unnecessarily
>         # weight 14.546
>         alg straw
>         hash 0  # rjenkins1
>         item osd.0 weight 3.636
>         item osd.1 weight 3.636
>         item osd.2 weight 3.636
>         item osd.3 weight 3.636
> }
> host osd02 {
>         id -3           # do not change unnecessarily
>         # weight 14.546
>         alg straw
>         hash 0  # rjenkins1
>         item osd.4 weight 3.636
>         item osd.5 weight 3.636
>         item osd.6 weight 3.636
>         item osd.7 weight 3.636
> }
> host osd03 {
>         id -4           # do not change unnecessarily
>         # weight 14.546
>         alg straw
>         hash 0  # rjenkins1
>         item osd.8 weight 3.636
>         item osd.9 weight 3.636
>         item osd.10 weight 3.636
>         item osd.11 weight 3.636
> }
> root default {
>         id -1           # do not change unnecessarily
>         # weight 43.637
>         alg straw
>         hash 0  # rjenkins1
>         item osd01 weight 14.546
>         item osd02 weight 14.546
>         item osd03 weight 14.546
> }
>
> # rules
> rule replicated_ruleset {
>         ruleset 0
>         type replicated
>         min_size 1
>         max_size 10
>         step take default
>         step chooseleaf firstn 0 type host
>         step emit
> }
>
> # end crush map
>
> I am not sure what the reason for the undersized state is. All osd disks
> are the same size and the replica size is 2. Data is also only replicated
> on a per-host basis, and I have 3 separate hosts. Maybe the number of PGs
> is incorrect? Is 1024 too big? Or maybe there is some misconfiguration in
> the crushmap?
>
>
> Kind regards,
> Piotr Dzionek
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

--
Cheers,
Brad
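For anyone reading along, the figures in the status output quoted above are easy to sanity-check. A minimal Python sketch of the arithmetic (the ~100-PGs-per-OSD figure is the commonly cited rule of thumb, not something stated in this thread):

```python
# Sanity checks for the numbers in the `ceph status` output quoted above.

# Degraded percentage: 67261 degraded object copies out of 827220 total.
degraded, total = 67261, 827220
pct = degraded / total * 100
print(f"degraded: {pct:.3f}%")            # 8.131, matching the report

# With 1 of 12 OSDs down, about 1/12 of all copies lose a replica.
print(f"expected: {100 / 12:.3f}%")       # ~8.333, consistent with the above

# The pg_num question: PG replicas landing on each OSD on average.
pg_num, size, num_osds = 1024, 2, 12
per_osd = pg_num * size / num_osds
print(f"PG replicas per OSD: {per_osd:.1f}")   # ~170.7, above the ~100
                                               # rule-of-thumb target
```

In other words, with one of twelve OSDs down but still "in", roughly one twelfth of all object copies lose a replica, which lines up with the reported 8.131%; and 1024 PGs at size 2 over 12 OSDs is on the high side of the usual guidance, but not obviously wrong.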