Hi folks,

Having a bit of trouble with EC setup on 0.78 and hoping someone can help me out. I've got most of the pieces in place; I think I'm just having a problem with the ruleset.

I am running 0.78:

ceph --version
ceph version 0.78 (f6c746c314d7b87b8419b6e584c94bfe4511dbd4)

I created a new ruleset:

ceph osd crush rule create-erasure ecruleset

Then I created a new erasure-coded pool on it:

ceph osd pool create mycontainers_1 1800 1800 erasure crush_ruleset=ecruleset erasure-code-k=9 erasure-code-m=3

The pool exists:

ceph@joceph-admin01:/etc/ceph$ ceph osd dump
epoch 106
fsid b12ebb71-e4a6-41fa-8246-71cbfa09fb6e
created 2014-03-24 12:06:28.290970
modified 2014-03-24 12:42:59.231381
flags
pool 0 'data' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 84 owner 0 flags hashpspool crash_replay_interval 45 stripe_width 0
pool 1 'metadata' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 86 owner 0 flags hashpspool stripe_width 0
pool 2 'rbd' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 88 owner 0 flags hashpspool stripe_width 0
pool 4 'mycontainers_2' replicated size 2 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 1200 pgp_num 1200 last_change 100 owner 0 flags hashpspool stripe_width 0
pool 5 'mycontainers_3' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 1800 pgp_num 1800 last_change 94 owner 0 flags hashpspool stripe_width 0
pool 6 'mycontainers_1' erasure size 12 min_size 1 crush_ruleset 1 object_hash rjenkins pg_num 1800 pgp_num 1800 last_change 104 owner 0 flags hashpspool stripe_width 4320

However, the new PGs won't come to a healthy state:

ceph@joceph-admin01:/etc/ceph$ ceph status
    cluster b12ebb71-e4a6-41fa-8246-71cbfa09fb6e
     health HEALTH_WARN 1800 pgs incomplete; 1800 pgs stuck inactive; 1800 pgs stuck unclean
     monmap e1: 2 mons at {mohonpeak01=10.0.0.101:6789/0,mohonpeak02=10.0.0.102:6789/0}, election epoch 4, quorum 0,1 mohonpeak01,mohonpeak02
     osdmap e106: 18 osds: 18 up, 18 in
      pgmap v261: 5184 pgs, 7 pools, 0 bytes data, 0 objects
            682 MB used, 15082 GB / 15083 GB avail
                3384 active+clean
                1800 incomplete

I think this is because the ruleset uses a failure domain of host and I only have 2 hosts (9 OSDs on each, 18 OSDs total). If I understand the "size 12" in the osd dump correctly, k=9 plus m=3 means each PG needs to place 12 chunks on 12 distinct hosts, which is impossible with 2 hosts. I suspect I need to change the ruleset to use a failure domain of osd instead of host.
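In case it matters, my fallback plan is to edit the CRUSH rule by hand along these lines (untested; I'm assuming the decompiled map shows ecruleset with a "step chooseleaf indep 0 type host" line that I would change to "type osd"):

# fetch and decompile the current CRUSH map
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# edit ecruleset in crushmap.txt: change "type host" to "type osd",
# then recompile and inject the new map
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new

But I'd rather use the documented route if it works on 0.78.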
This is also mentioned on this page: https://ceph.com/docs/master/dev/erasure-coded-pool/. However, the guidance on that page to adjust it using commands of the form "ceph osd erasure-code-profile set myprofile" is not working for me. As far as I can tell, "ceph osd erasure-code-profile" is not valid command syntax on my build. Is this documentation correct and up to date for 0.78? Can anyone suggest where I am going wrong? Thanks!

ceph@joceph-admin01:/etc/ceph$ ceph osd erasure-code-profile ls
no valid command found; 10 closest matches:
osd tier add-cache <poolname> <poolname> <int[0-]>
osd tier set-overlay <poolname> <poolname>
osd tier remove-overlay <poolname>
osd tier remove <poolname> <poolname>
osd tier cache-mode <poolname> none|writeback|forward|readonly
osd thrash <int[0-]>
osd tier add <poolname> <poolname> {--force-nonempty}
osd stat
osd reweight-by-utilization {<int[100-]>}
osd pool stats {<name>}
Error EINVAL: invalid command
ceph@joceph-admin01:/etc/ceph$
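For reference, the commands I was trying to adapt from that page look roughly like this (the profile name "myprofile" and the ruleset-failure-domain setting are taken from the doc's example, so treat them as illustrative; the "erasure-code-profile set" step is exactly what fails for me with EINVAL as above):

# create a profile with an OSD-level failure domain, per the doc
ceph osd erasure-code-profile set myprofile ruleset-failure-domain=osd
# create an EC pool that uses that profile
ceph osd pool create ecpool 12 12 erasure myprofile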