Dear All

We created an erasure-coded pool with k=4, m=2 and failure-domain=host, but we only have 6 OSD nodes. Is it correct that recovery will be blocked by the CRUSH rule if a node is down?

After rebooting all nodes we noticed that recovery was slow (it took maybe half an hour), even though all pools are currently empty (new install). This is odd... Can it be related to k+m being equal to the number of nodes (4+2=6)?

step set_choose_tries 100 was already in the EC CRUSH rule:

rule ewos1-prod_cinder_ec {
        id 2
        type erasure
        min_size 3
        max_size 6
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default class nvme
        step chooseleaf indep 0 type host
        step emit
}

The profile and pool were created like this:

ceph osd erasure-code-profile set ec42 k=4 m=2 crush-root=default crush-failure-domain=host crush-device-class=nvme
ceph osd pool create ewos1-prod_cinder_ec 256 256 erasure ec42

ceph version 12.2.10-543-gfc6f0c7299 (fc6f0c7299e3442e8a0ab83260849a6249ce7b5f) luminous (stable)

  cluster:
    id:     b5e30221-a214-353c-b66b-8c37b4349123
    health: HEALTH_WARN
            noout flag(s) set
            Reduced data availability: 125 pgs inactive, 32 pgs peering

  services:
    mon: 3 daemons, quorum ewos1-osd1-prod,ewos1-osd3-prod,ewos1-osd5-prod
    mgr: ewos1-osd5-prod(active), standbys: ewos1-osd3-prod, ewos1-osd1-prod
    osd: 24 osds: 24 up, 24 in
         flags noout

  data:
    pools:   4 pools, 1600 pgs
    objects: 0 objects, 0B
    usage:   24.3GiB used, 43.6TiB / 43.7TiB avail
    pgs:     7.812% pgs not active
             1475 active+clean
             93   activating
             32   peering

Which k and m values are preferred on 6 nodes?

BTW, we plan to use this EC pool as a second rbd pool in OpenStack, with the main (first) rbd pool being replicated with size=3; it is NVMe SSD only.

Thanks for your help!

Best Regards
Francois Scheurer
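
PS: for context, a minimal sketch of how we intend to attach the EC pool as an rbd data pool (the replicated pool name "ewos1-prod_cinder" and the image name are just placeholders, not our actual configuration):

# allow partial overwrites on the EC pool, needed for rbd data on EC (Luminous+, BlueStore)
ceph osd pool set ewos1-prod_cinder_ec allow_ec_overwrites true

# rbd image: metadata in the replicated pool, data objects in the EC pool
rbd create --size 10G --data-pool ewos1-prod_cinder_ec ewos1-prod_cinder/volume-test

# sanity checks on the EC profile and the pool's min_size
ceph osd erasure-code-profile get ec42
ceph osd pool get ewos1-prod_cinder_ec min_size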