I created a new, more complex rule:

rule datacenter_rep2 {
        ruleset 2
        type replicated
        min_size 1
        max_size 10
        step take default
        step choose firstn 2 type datacenter
        step chooseleaf firstn -1 type host
        step emit
}

assigned it to the pools, and now the cluster works as I expect.

2014-05-20 11:59 GMT+12:00 Vladislav Gorbunov <vadikgo at gmail.com>:
> Hi!
>
> Can you help me understand why a crushmap with
>     step chooseleaf firstn 0 type host
> can't work when the hosts are placed in datacenters?
>
> If I have this osd tree:
>
> # id    weight  type name       up/down reweight
> -1      0.12    root default
> -3      0.03            host tceph2
> 1       0.03                    osd.1   up      1
> -4      0.03            host tceph3
> 2       0.03                    osd.2   up      1
> -2      0.03            host tceph1
> 0       0.03                    osd.0   up      1
> -5      0.03            host tceph4
> 3       0.03                    osd.3   up      1
> -7      0               datacenter dc1
> -6      0               datacenter dc2
>
> and the default crush map rule
>
> { "rule_id": 0,
>   "rule_name": "data",
>   "ruleset": 0,
>   "type": 1,
>   "min_size": 1,
>   "max_size": 10,
>   "steps": [
>         { "op": "take",
>           "item": -1},
>         { "op": "chooseleaf_firstn",
>           "num": 0,
>           "type": "host"},
>         { "op": "emit"}]},
>
> used by these pools:
>
> pool 0 'data' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins
> pg_num 64 pgp_num 64 last_change 1176 owner 0 crash_replay_interval 45
> pool 1 'metadata' rep size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 64 pgp_num 64 last_change 1190 owner 0
> pool 2 'rbd' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins
> pg_num 64 pgp_num 64 last_change 1182 owner 0
>
> then when one of the OSDs goes down, the cluster successfully rebalances
> back to an OK state:
>
> # id    weight  type name       up/down reweight
> -1      0.12    root default
> -3      0.03            host tceph2
> 1       0.03                    osd.1   down    0
> -4      0.03            host tceph3
> 2       0.03                    osd.2   up      1
> -2      0.03            host tceph1
> 0       0.03                    osd.0   up      1
> -5      0.03            host tceph4
> 3       0.03                    osd.3   up      1
> -7      0               datacenter dc1
> -6      0               datacenter dc2
>
> ceph -s
>   cluster 6bdb23fb-adac-4113-8c75-e6bd245fcfd6
>    health HEALTH_OK
>    monmap e1: 1 mons at {tceph1=10.166.10.95:6789/0}, election epoch 1,
> quorum 0 tceph1
>    osdmap e1207: 4 osds: 3 up, 3 in
>     pgmap v4114539: 480 pgs: 480 active+clean; 2628 MB data, 5840 MB used,
> 89341 MB / 95182 MB avail
>    mdsmap e1: 0/0/1 up
>
> But if the hosts are moved into datacenters, as in this map:
>
> # id    weight  type name       up/down reweight
> -1      0.12    root default
> -7      0.06            datacenter dc1
> -4      0.03                    host tceph3
> 2       0.03                            osd.2   up      1
> -5      0.03                    host tceph4
> 3       0.03                            osd.3   up      1
> -6      0.06            datacenter dc2
> -2      0.03                    host tceph1
> 0       0.03                            osd.0   down    0
> -3      0.03                    host tceph2
> 1       0.03                            osd.1   up      1
>
> the cluster can't reach an OK state while one host is out of the cluster:
>
>   cluster 6bdb23fb-adac-4113-8c75-e6bd245fcfd6
>    health HEALTH_WARN 6 pgs incomplete; 6 pgs stuck inactive; 6 pgs stuck
> unclean
>    monmap e1: 1 mons at {tceph1=10.166.10.95:6789/0}, election epoch 1,
> quorum 0 tceph1
>    osdmap e1256: 4 osds: 3 up, 3 in
>     pgmap v4114707: 480 pgs: 474 active+clean, 6 incomplete; 2516 MB data,
> 5606 MB used, 89575 MB / 95182 MB avail
>    mdsmap e1: 0/0/1 up
>
> If the downed host comes back up and rejoins the cluster, health returns
> to OK. Likewise, if the downed OSD is manually reweighted to 0, cluster
> health is OK.
>
> A crushmap with
>     step chooseleaf firstn 0 type datacenter
> has the same issue:
>
> { "rule_id": 3,
>   "rule_name": "datacenter_rule",
>   "ruleset": 3,
>   "type": 1,
>   "min_size": 1,
>   "max_size": 10,
>   "steps": [
>         { "op": "take",
>           "item": -8},
>         { "op": "chooseleaf_firstn",
>           "num": 0,
>           "type": "datacenter"},
>         { "op": "emit"}]},
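P.S. For anyone finding this thread later: applying such a rule and pointing
the pools at it looks roughly like this (a sketch, not the exact commands
from my session; the file names are placeholders, the pool names are taken
from the listing above):

    # extract and decompile the current crush map
    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt

    # add the datacenter_rep2 rule to crushmap.txt, then recompile and inject
    crushtool -c crushmap.txt -o crushmap-new.bin
    ceph osd setcrushmap -i crushmap-new.bin

    # switch the pools to ruleset 2
    ceph osd pool set data crush_ruleset 2
    ceph osd pool set metadata crush_ruleset 2
    ceph osd pool set rbd crush_ruleset 2

The new rule can also be sanity-checked offline before injecting it, for
example:

    crushtool -i crushmap-new.bin --test --rule 2 --num-rep 2 --show-bad-mappings

Empty output should mean that every PG gets the requested number of replicas.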