Hi! Can you help me understand why a crushmap with step chooseleaf firstn 0 type host can't work when the hosts are placed under datacenter buckets?

If I have this osd tree:

# id    weight  type name       up/down reweight
-1      0.12    root default
-3      0.03            host tceph2
1       0.03                    osd.1   up      1
-4      0.03            host tceph3
2       0.03                    osd.2   up      1
-2      0.03            host tceph1
0       0.03                    osd.0   up      1
-5      0.03            host tceph4
3       0.03                    osd.3   up      1
-7      0               datacenter dc1
-6      0               datacenter dc2

and the default crush rule

{ "rule_id": 0,
  "rule_name": "data",
  "ruleset": 0,
  "type": 1,
  "min_size": 1,
  "max_size": 10,
  "steps": [
        { "op": "take",
          "item": -1},
        { "op": "chooseleaf_firstn",
          "num": 0,
          "type": "host"},
        { "op": "emit"}]}

used by these pools:

pool 0 'data' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1176 owner 0 crash_replay_interval 45
pool 1 'metadata' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1190 owner 0
pool 2 'rbd' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1182 owner 0

then when one of the OSDs goes down, the cluster successfully rebalances back to an OK state:

# id    weight  type name       up/down reweight
-1      0.12    root default
-3      0.03            host tceph2
1       0.03                    osd.1   down    0
-4      0.03            host tceph3
2       0.03                    osd.2   up      1
-2      0.03            host tceph1
0       0.03                    osd.0   up      1
-5      0.03            host tceph4
3       0.03                    osd.3   up      1
-7      0               datacenter dc1
-6      0               datacenter dc2

# ceph -s
cluster 6bdb23fb-adac-4113-8c75-e6bd245fcfd6
 health HEALTH_OK
 monmap e1: 1 mons at {tceph1=10.166.10.95:6789/0}, election epoch 1, quorum 0 tceph1
 osdmap e1207: 4 osds: 3 up, 3 in
 pgmap v4114539: 480 pgs: 480 active+clean; 2628 MB data, 5840 MB used, 89341 MB / 95182 MB avail
 mdsmap e1: 0/0/1 up

But if the hosts are moved into datacenters, as in this map:

# id    weight  type name       up/down reweight
-1      0.12    root default
-7      0.06            datacenter dc1
-4      0.03                    host tceph3
2       0.03                            osd.2   up      1
-5      0.03                    host tceph4
3       0.03                            osd.3   up      1
-6      0.06            datacenter dc2
-2      0.03                    host tceph1
0       0.03                            osd.0   down    0
-3      0.03                    host tceph2
1       0.03                            osd.1   up      1

then the cluster can't reach an OK state while one host is out of the cluster:

cluster 6bdb23fb-adac-4113-8c75-e6bd245fcfd6
 health HEALTH_WARN 6 pgs incomplete; 6 pgs stuck inactive; 6 pgs stuck unclean
 monmap e1: 1 mons at {tceph1=10.166.10.95:6789/0}, election epoch 1, quorum 0 tceph1
 osdmap e1256: 4 osds: 3 up, 3 in
 pgmap v4114707: 480 pgs: 474 active+clean, 6 incomplete; 2516 MB data, 5606 MB used, 89575 MB / 95182 MB avail
 mdsmap e1: 0/0/1 up

If the downed host comes back up and rejoins the cluster, health returns to OK. Likewise, if the downed OSD is manually reweighted to 0, cluster health is OK.

A crushmap with step chooseleaf firstn 0 type datacenter has the same issue:

{ "rule_id": 3,
  "rule_name": "datacenter_rule",
  "ruleset": 3,
  "type": 1,
  "min_size": 1,
  "max_size": 10,
  "steps": [
        { "op": "take",
          "item": -8},
        { "op": "chooseleaf_firstn",
          "num": 0,
          "type": "datacenter"},
        { "op": "emit"}]}
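For reference, I believe the data rule above corresponds to this decompiled crushmap syntax (as crushtool -d would print it, with type 1 = replicated and item -1 = the default root):

rule data {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}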
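For completeness, the datacenter buckets were created and the hosts moved with commands along these lines (standard ceph osd crush commands; bucket and host names as in the tree above):

ceph osd crush add-bucket dc1 datacenter
ceph osd crush add-bucket dc2 datacenter
ceph osd crush move dc1 root=default
ceph osd crush move dc2 root=default
ceph osd crush move tceph3 datacenter=dc1
ceph osd crush move tceph4 datacenter=dc1
ceph osd crush move tceph1 datacenter=dc2
ceph osd crush move tceph2 datacenter=dc2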
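The manual reweight workaround mentioned above was something like one of the following, for the OSD shown down in the tree (osd.0):

# drop the override reweight of the down OSD to zero ...
ceph osd reweight 0 0
# ... or zero out its CRUSH weight entirely
ceph osd crush reweight osd.0 0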
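A sketch for reproducing this offline with crushtool, in case it helps (file names are placeholders; --show-bad-mappings prints every input that does not get the full replica count):

# dump and decompile the current crushmap
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt

# simulate rule 0 with 2 replicas while osd.0 has weight 0;
# each line printed is a PG input that could not be mapped to 2 OSDs
crushtool -i crushmap.bin --test --rule 0 --num-rep 2 \
        --weight 0 0 --show-bad-mappings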