Hello,

I have created a small EC pool (16 PGs) with k=4, m=2, and applied the
following CRUSH rule to it:

    rule test_ec {
            id 99
            type erasure
            min_size 5
            max_size 6
            step set_chooseleaf_tries 5
            step set_choose_tries 100
            step take default
            step choose indep 3 type host
            step chooseleaf indep 2 type osd
            step emit
    }

The OSD tree looks as follows:

    -1         43.38448  root default
    -9         43.38448      region lab1
    -7         43.38448          room dc1.lab1
    -5         43.38448              rack r1.dc1.lab1
    -3         14.44896                  host host1.r1.dc1.lab1
     6    hdd   3.63689                      osd.6   up  1.00000  1.00000
     8    hdd   3.63689                      osd.8   up  1.00000  1.00000
     7    hdd   3.63689                      osd.7   up  1.00000  1.00000
    11    hdd   3.53830                      osd.11  up  1.00000  1.00000
    -11        14.44896                  host host2.r1.dc1.lab1
     4    hdd   3.63689                      osd.4   up  1.00000  1.00000
     9    hdd   3.63689                      osd.9   up  1.00000  1.00000
     5    hdd   3.63689                      osd.5   up  1.00000  1.00000
    10    hdd   3.53830                      osd.10  up  1.00000  1.00000
    -13        14.48656                  host host3.r1.dc1.lab1
     0    hdd   3.57590                      osd.0   up  1.00000  1.00000
     1    hdd   3.63689                      osd.1   up  1.00000  1.00000
     2    hdd   3.63689                      osd.2   up  1.00000  1.00000
     3    hdd   3.63689                      osd.3   up  1.00000  1.00000

My expectation was that each host would hold exactly two shards of every
PG in the pool. When I dumped the PGs this was true for all but one PG,
which places three shards (OSDs 0, 2 and 3) on host3. Since a k=4, m=2
pool can tolerate the loss of at most two shards, a failure of host3
would make that PG unavailable.

    root@host1:~/mkw # ceph pg dump | grep "^66\." | awk '{print $17}'
    dumped all
    [4,5,7,6,1,2]
    [8,11,9,3,0,2]   <<< this one is problematic
    [6,7,10,9,2,0]
    [2,3,7,6,5,9]
    [7,8,10,5,3,1]
    [4,5,8,6,0,2]
    [7,11,9,4,1,2]
    [5,9,0,2,7,11]
    [9,5,3,1,7,8]
    [8,11,2,0,5,9]
    [2,0,8,6,10,9]
    [3,2,5,9,7,11]
    [6,7,9,5,1,2]
    [10,5,1,3,11,8]
    [4,5,7,8,2,0]
    [7,8,3,2,9,10]

Is there a way to ensure that a single host failure is not disruptive to
the cluster?

During the experiment I used info from this thread:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-October/030227.html

Kind regards,
Maks Kowalik
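
P.S. The rule can also be exercised offline with crushtool, without
touching the live cluster. A minimal sketch, assuming the map is exported
to a local file named "crushmap" (the file name and the 0-15 input range
are just examples matching a 16-PG pool):

    # export the cluster's compiled CRUSH map to a local file
    ceph osd getcrushmap -o crushmap
    # simulate 16 placements of 6 shards each through rule id 99
    crushtool -i crushmap --test --rule 99 --num-rep 6 \
        --min-x 0 --max-x 15 --show-mappings --show-bad-mappings

Note that --show-bad-mappings only reports inputs for which CRUSH comes up
short (fewer than --num-rep OSDs); it does not verify the two-shards-per-host
placement itself.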
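The per-host placement of the live mappings can be checked with a small
script instead of eyeballing the dump. A sketch, hard-coding the
OSD-to-host assignment from the tree above and assuming, as in the command
above, that the pool id is 66 and the acting set is in column 17:

    ceph pg dump 2>/dev/null | grep "^66\." | awk '{print $17}' | awk '
    # OSD -> host, per the tree: 0-3 on host3, 4/5/9/10 on host2, rest on host1
    function host(o) {
        if (o <= 3)                                return "host3"
        if (o == 4 || o == 5 || o == 9 || o == 10) return "host2"
        return "host1"
    }
    {
        gsub(/\[|\]/, "")          # strip the brackets around the acting set
        n = split($0, osds, ",")
        split("", cnt)             # reset the per-PG shard counters
        for (i = 1; i <= n; i++) cnt[host(osds[i])]++
        for (h in cnt)
            if (cnt[h] > 2)
                print "violation: [" $0 "] has " cnt[h] " shards on " h
    }'

On the dump above this should flag exactly the problematic acting set,
reporting 3 shards on host3.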