I think the failure domain in your rules is wrong:

    step choose firstn 0 type osd

should be:

    step chooseleaf firstn 0 type host

(note "chooseleaf": the rule has to descend from each chosen host down
to an OSD, otherwise it would emit host buckets instead of devices).
With "type osd" as the failure domain, CRUSH only has to pick distinct
OSDs, so nothing stops it from putting two or three replicas on the
same host, which is exactly what your pg dump shows.
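For reference, here is roughly how your two rules would look after the
change; this is a sketch based on the map you posted, keeping your rule
ids and device classes:

    rule ssd {
            id 0
            type replicated
            min_size 1
            max_size 10
            step take default class ssd
            # one OSD from each of "pool size" distinct hosts
            step chooseleaf firstn 0 type host
            step emit
    }
    rule hdd {
            id 1
            type replicated
            min_size 1
            max_size 10
            step take default class hdd
            step chooseleaf firstn 0 type host
            step emit
    }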
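One way to apply it is to pull the map, edit the two rules and inject
it back (the file names here are just examples):

    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt
    # edit crushmap.txt: replace "step choose firstn 0 type osd"
    # with "step chooseleaf firstn 0 type host" in both rules
    crushtool -c crushmap.txt -o crushmap-new.bin
    ceph osd setcrushmap -i crushmap-new.bin

You can sanity-check the new map before injecting it, e.g.:

    crushtool -i crushmap-new.bin --test --rule 1 --num-rep 3 --show-mappings

and verify that no line maps two OSDs of the same host. Since you are
on luminous you could alternatively create fresh rules with "ceph osd
crush rule create-replicated <name> default host <class>" and point the
pools at them with "ceph osd pool set <pool> crush_rule <name>". Either
way, expect some data movement once the corrected mapping takes effect.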
On 10/10/2017 5:05 PM, Konrad Riedel wrote:
> Hello Ceph-users,
>
> after switching to luminous I was excited about the great
> crush-device-class feature - now we have 5 servers with 1x2TB
> NVMe-based OSDs, 3 of them additionally with 4 HDDs per server. (we
> have only three 400G NVMe disks for block.wal and block.db and
> therefore can't distribute all HDDs evenly on all servers.)
>
> Output from "ceph pg dump" shows that some PGs end up on HDD OSDs on
> the same host:
>
> ceph pg map 5.b
> osdmap e12912 pg 5.b (5.b) -> up [9,7,8] acting [9,7,8]
>
> (on rebooting this host I had 4 stale PGs)
>
> I've written a small perl script to add the hostname after each OSD
> number and got many PGs where ceph placed 2 replicas on the same
> host...:
>
> 5.1e7: 8 - daniel  9 - daniel  11 - udo
> 5.1eb: 10 - udo    7 - daniel  9 - daniel
> 5.1ec: 10 - udo    11 - udo    7 - daniel
> 5.1ed: 13 - felix  16 - felix  5 - udo
>
> Is there any way I can correct this?
>
> Please see crushmap below. Thanks for any help!
>
> # begin crush map
> tunable choose_local_tries 0
> tunable choose_local_fallback_tries 0
> tunable choose_total_tries 50
> tunable chooseleaf_descend_once 1
> tunable chooseleaf_vary_r 1
> tunable chooseleaf_stable 1
> tunable straw_calc_version 1
> tunable allowed_bucket_algs 54
>
> # devices
> device 0 osd.0 class hdd
> device 1 device1
> device 2 osd.2 class ssd
> device 3 device3
> device 4 device4
> device 5 osd.5 class hdd
> device 6 device6
> device 7 osd.7 class hdd
> device 8 osd.8 class hdd
> device 9 osd.9 class hdd
> device 10 osd.10 class hdd
> device 11 osd.11 class hdd
> device 12 osd.12 class hdd
> device 13 osd.13 class hdd
> device 14 osd.14 class hdd
> device 15 device15
> device 16 osd.16 class hdd
> device 17 device17
> device 18 device18
> device 19 device19
> device 20 device20
> device 21 device21
> device 22 device22
> device 23 device23
> device 24 osd.24 class hdd
> device 25 device25
> device 26 osd.26 class hdd
> device 27 osd.27 class hdd
> device 28 osd.28 class hdd
> device 29 osd.29 class hdd
> device 30 osd.30 class ssd
> device 31 osd.31 class ssd
> device 32 osd.32 class ssd
> device 33 osd.33 class ssd
>
> # types
> type 0 osd
> type 1 host
> type 2 rack
> type 3 row
> type 4 room
> type 5 datacenter
> type 6 root
>
> # buckets
> host daniel {
>     id -4           # do not change unnecessarily
>     id -2 class hdd # do not change unnecessarily
>     id -9 class ssd # do not change unnecessarily
>     # weight 3.459
>     alg straw2
>     hash 0  # rjenkins1
>     item osd.31 weight 1.819
>     item osd.7 weight 0.547
>     item osd.8 weight 0.547
>     item osd.9 weight 0.547
> }
> host felix {
>     id -5            # do not change unnecessarily
>     id -3 class hdd  # do not change unnecessarily
>     id -10 class ssd # do not change unnecessarily
>     # weight 3.653
>     alg straw2
>     hash 0  # rjenkins1
>     item osd.33 weight 1.819
>     item osd.13 weight 0.547
>     item osd.14 weight 0.467
>     item osd.16 weight 0.547
>     item osd.0 weight 0.274
> }
> host udo {
>     id -6            # do not change unnecessarily
>     id -7 class hdd  # do not change unnecessarily
>     id -11 class ssd # do not change unnecessarily
>     # weight 4.006
>     alg straw2
>     hash 0  # rjenkins1
>     item osd.32 weight 1.819
>     item osd.5 weight 0.547
>     item osd.10 weight 0.547
>     item osd.11 weight 0.547
>     item osd.12 weight 0.547
> }
> host moritz {
>     id -13           # do not change unnecessarily
>     id -14 class hdd # do not change unnecessarily
>     id -15 class ssd # do not change unnecessarily
>     # weight 1.819
>     alg straw2
>     hash 0  # rjenkins1
>     item osd.30 weight 1.819
> }
> host bruno {
>     id -16           # do not change unnecessarily
>     id -17 class hdd # do not change unnecessarily
>     id -18 class ssd # do not change unnecessarily
>     # weight 3.183
>     alg straw2
>     hash 0  # rjenkins1
>     item osd.24 weight 0.273
>     item osd.26 weight 0.273
>     item osd.27 weight 0.273
>     item osd.28 weight 0.273
>     item osd.29 weight 0.273
>     item osd.2 weight 1.819
> }
> root default {
>     id -1            # do not change unnecessarily
>     id -8 class hdd  # do not change unnecessarily
>     id -12 class ssd # do not change unnecessarily
>     # weight 16.121
>     alg straw2
>     hash 0  # rjenkins1
>     item daniel weight 3.459
>     item felix weight 3.653
>     item udo weight 4.006
>     item moritz weight 1.819
>     item bruno weight 3.183
> }
>
> # rules
> rule ssd {
>     id 0
>     type replicated
>     min_size 1
>     max_size 10
>     step take default class ssd
>     step choose firstn 0 type osd
>     step emit
> }
> rule hdd {
>     id 1
>     type replicated
>     min_size 1
>     max_size 10
>     step take default class hdd
>     step choose firstn 0 type osd
>     step emit
> }
>
> # end crush map