In the IRC chat dmick helped me confirm that commands of the form "ceph osd erasure-code-profile" are only in the master branch and are not in 0.78 (thanks dmick), so let me revise my query a bit:

1. Does it seem likely the problem described below is due to the failure domains, and that the solution is to change the failure domain to OSDs instead of the default of hosts?
2. If so, how would you make such a change for an erasure code pool / ruleset in the 0.78 branch?

Thanks!
-Joe

From: Gruher, Joseph R

Hi Folks-

Having a bit of trouble with EC setup on 0.78. Hoping someone can help me out. I've got most of the pieces in place; I think I'm just having a problem with the ruleset.

I am running 0.78:

ceph --version
ceph version 0.78 (f6c746c314d7b87b8419b6e584c94bfe4511dbd4)

I created a new ruleset:

ceph osd crush rule create-erasure ecruleset

Then I created a new erasure code pool:

ceph osd pool create mycontainers_1 1800 1800 erasure crush_ruleset=ecruleset erasure-code-k=9 erasure-code-m=3

Pool exists:

ceph@joceph-admin01:/etc/ceph$ ceph osd dump
epoch 106
fsid b12ebb71-e4a6-41fa-8246-71cbfa09fb6e
created 2014-03-24 12:06:28.290970
modified 2014-03-24 12:42:59.231381
flags
pool 0 'data' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 84 owner 0 flags hashpspool crash_replay_interval 45 stripe_width 0
pool 1 'metadata' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 86 owner 0 flags hashpspool stripe_width 0
pool 2 'rbd' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 88 owner 0 flags hashpspool stripe_width 0
pool 4 'mycontainers_2' replicated size 2 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 1200 pgp_num 1200 last_change 100 owner 0 flags hashpspool stripe_width 0
pool 5 'mycontainers_3' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 1800 pgp_num 1800 last_change 94 owner 0 flags hashpspool stripe_width 0
pool 6 'mycontainers_1' erasure size 12 min_size 1 crush_ruleset 1 object_hash rjenkins pg_num 1800 pgp_num 1800 last_change 104 owner 0 flags hashpspool stripe_width 4320

However, the new PGs won't come to a healthy state:

ceph@joceph-admin01:/etc/ceph$ ceph status
    cluster b12ebb71-e4a6-41fa-8246-71cbfa09fb6e
     health HEALTH_WARN 1800 pgs incomplete; 1800 pgs stuck inactive; 1800 pgs stuck unclean
     monmap e1: 2 mons at {mohonpeak01=10.0.0.101:6789/0,mohonpeak02=10.0.0.102:6789/0}, election epoch 4, quorum 0,1 mohonpeak01,mohonpeak02
     osdmap e106: 18 osds: 18 up, 18 in
      pgmap v261: 5184 pgs, 7 pools, 0 bytes data, 0 objects
            682 MB used, 15082 GB / 15083 GB avail
                3384 active+clean
                1800 incomplete

I think this is because the ruleset is using a failure domain of hosts: with k=9 and m=3 each PG needs to place 12 chunks on 12 distinct hosts, and I only have 2 hosts (with 9 OSDs on each for 18 OSDs total). I suspect I need to change the ruleset to use a failure domain of OSD instead of host.
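For what it's worth, the only way I can see to make that change on 0.78 is to edit the CRUSH map by hand. Something like the sketch below is what I have in mind (untested; switching the rule's step from "type host" to "type osd" is my guess at the edit that is actually needed):

# Confirm what the erasure ruleset currently looks like
ceph osd crush rule dump

# Pull down the CRUSH map, decompile it, edit the rule, recompile, and push it back
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# In crushmap.txt, in rule "ecruleset", change the choose/chooseleaf step
# from "type host" to "type osd" (again, my guess at the required change)
crushtool -c crushmap.txt -o crushmap-new.bin
ceph osd setcrushmap -i crushmap-new.bin

Is that the right approach, or is there a supported way to do it?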
The need to change the failure domain is also mentioned on this page: https://ceph.com/docs/master/dev/erasure-coded-pool/. However, the guidance on that page to adjust it using commands of the form "ceph osd erasure-code-profile set myprofile" is not working for me. As far as I can tell, "ceph osd erasure-code-profile" is not valid command syntax. Is this documentation correct and up to date for 0.78? Can anyone suggest where I am going wrong? Thanks!

ceph@joceph-admin01:/etc/ceph$ ceph osd erasure-code-profile ls
no valid command found; 10 closest matches:
osd tier add-cache <poolname> <poolname> <int[0-]>
osd tier set-overlay <poolname> <poolname>
osd tier remove-overlay <poolname>
osd tier remove <poolname> <poolname>
osd tier cache-mode <poolname> none|writeback|forward|readonly
osd thrash <int[0-]>
osd tier add <poolname> <poolname> {--force-nonempty}
osd stat
osd reweight-by-utilization {<int[100-]>}
osd pool stats {<name>}
Error EINVAL: invalid command
ceph@joceph-admin01:/etc/ceph$
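One other thought: since the pool create command above takes properties like erasure-code-k and erasure-code-m directly, maybe the failure domain can be passed the same way at create time. This is purely a guess by analogy with those properties (I have not confirmed that a property named erasure-code-ruleset-failure-domain exists in 0.78), but it would look something like:

ceph osd pool create mycontainers_1 1800 1800 erasure erasure-code-k=9 erasure-code-m=3 erasure-code-ruleset-failure-domain=osd

# Then check what failure domain the generated rule actually uses
ceph osd crush rule dump

If anyone knows whether 0.78 supports something along those lines, I'd appreciate a pointer.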