Good morning folks,

As a newbie to Ceph, yesterday was the first time I configured my CRUSH map, added a CRUSH rule, and created my first pool using that rule.
Since then the cluster reports HEALTH_WARN with the following output:

~~~
$ sudo ceph status
  cluster:
    id:     47c108bd-db66-4197-96df-cadde9e9eb45
    health: HEALTH_WARN
            Degraded data redundancy: 128 pgs undersized
            1 pools have pg_num > pgp_num

  services:
    mon: 3 daemons, quorum ccp-tcnm01,ccp-tcnm02,ccp-tcnm03
    mgr: ccp-tcnm01(active), standbys: ccp-tcnm03, ccp-tcnm02
    osd: 3 osds: 3 up, 3 in

  data:
    pools:   1 pools, 128 pgs
    objects: 0 objects, 0 bytes
    usage:   3088 MB used, 3068 GB / 3071 GB avail
    pgs:     128 active+undersized
~~~

The pool was created by running `sudo ceph osd pool create joergsfirstpool 128 replicated replicate_datacenter`.
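In case it helps, the pool's other settings (replica size, min_size, and the CRUSH rule it uses) can be listed with the commands below; I have not pasted the output here:

~~~
# List all pools with their size, min_size, pg_num, pgp_num and crush_rule
$ sudo ceph osd pool ls detail

# Or query individual settings for the pool
$ sudo ceph osd pool get joergsfirstpool size
$ sudo ceph osd pool get joergsfirstpool min_size
~~~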
I've figured out that I forgot to set the pgp_num value accordingly, so I corrected it by running `sudo ceph osd pool set joergsfirstpool pgp_num 128`. As you can see in the following output, 15 PGs were remapped, but 113 still remain in active+undersized.
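For completeness, the two values can be read back like this (both should now report 128 if the command above was applied):

~~~
$ sudo ceph osd pool get joergsfirstpool pg_num
$ sudo ceph osd pool get joergsfirstpool pgp_num
~~~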
~~~
$ sudo ceph status
  cluster:
    id:     47c108bd-db66-4197-96df-cadde9e9eb45
    health: HEALTH_WARN
            Degraded data redundancy: 113 pgs undersized

  services:
    mon: 3 daemons, quorum ccp-tcnm01,ccp-tcnm02,ccp-tcnm03
    mgr: ccp-tcnm01(active), standbys: ccp-tcnm03, ccp-tcnm02
    osd: 3 osds: 3 up, 3 in; 15 remapped pgs

  data:
    pools:   1 pools, 128 pgs
    objects: 0 objects, 0 bytes
    usage:   3089 MB used, 3068 GB / 3071 GB avail
    pgs:     113 active+undersized
             15  active+clean+remapped
~~~

My questions are:

1. What does active+undersized actually mean? I did not find anything about it in the documentation on docs.ceph.com.
2. Why were only 15 PGs remapped after I corrected the wrong pgp_num value?
3. What's wrong here, and what do I have to do to get the cluster back to active+clean again? (I've sketched a few diagnostic commands right below this list in case more output is needed.)
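If more output is needed, these are the commands I would use to look at the affected PGs; the PG id in the last command is just a placeholder:

~~~
# List the PGs that are stuck in the undersized state
$ sudo ceph pg dump_stuck undersized

# Inspect one of them in detail (up/acting OSD sets, state history);
# replace 1.0 with an actual PG id taken from the previous output
$ sudo ceph pg 1.0 query
~~~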
For further information, you can find my current CRUSH map below:

~~~
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54

# devices
device 0 osd.0 class hdd
device 1 osd.1 class hdd
device 2 osd.2 class hdd

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host ccp-tcnm01 {
	id -5		# do not change unnecessarily
	id -6 class hdd		# do not change unnecessarily
	# weight 1.000
	alg straw2
	hash 0	# rjenkins1
	item osd.1 weight 1.000
}
host ccp-tcnm03 {
	id -7		# do not change unnecessarily
	id -8 class hdd		# do not change unnecessarily
	# weight 1.000
	alg straw2
	hash 0	# rjenkins1
	item osd.2 weight 1.000
}
datacenter dc1 {
	id -9		# do not change unnecessarily
	id -12 class hdd		# do not change unnecessarily
	# weight 2.000
	alg straw2
	hash 0	# rjenkins1
	item ccp-tcnm01 weight 1.000
	item ccp-tcnm03 weight 1.000
}
host ccp-tcnm02 {
	id -3		# do not change unnecessarily
	id -4 class hdd		# do not change unnecessarily
	# weight 1.000
	alg straw2
	hash 0	# rjenkins1
	item osd.0 weight 1.000
}
datacenter dc3 {
	id -10		# do not change unnecessarily
	id -11 class hdd		# do not change unnecessarily
	# weight 1.000
	alg straw2
	hash 0	# rjenkins1
	item ccp-tcnm02 weight 1.000
}
root default {
	id -1		# do not change unnecessarily
	id -2 class hdd		# do not change unnecessarily
	# weight 3.000
	alg straw2
	hash 0	# rjenkins1
	item dc1 weight 2.000
	item dc3 weight 1.000
}

# rules
rule replicated_rule {
	id 0
	type replicated
	min_size 1
	max_size 10
	step take default
	step chooseleaf firstn 0 type host
	step emit
}
rule replicate_datacenter {
	id 1
	type replicated
	min_size 1
	max_size 10
	step take default
	step chooseleaf firstn 0 type datacenter
	step emit
}
# end crush map
~~~

Best regards,
Joerg
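P.S.: In case it is useful, this is roughly how I would dump the map and test the replicate_datacenter rule offline with crushtool. The file names are just placeholders, the rule id 1 comes from the map above, and --num-rep 3 assumes the pool still uses the default replica size of 3:

~~~
# Grab the compiled CRUSH map from the cluster
$ sudo ceph osd getcrushmap -o crushmap.bin

# Decompile it into the text form shown above (optional)
$ crushtool -d crushmap.bin -o crushmap.txt

# Simulate rule id 1 for 3 replicas and report how the mappings turn out,
# including any that end up with fewer OSDs than requested
$ crushtool -i crushmap.bin --test --rule 1 --num-rep 3 --show-statistics
$ crushtool -i crushmap.bin --test --rule 1 --num-rep 3 --show-bad-mappings
~~~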