In the IRC chat dmick helped me confirm that commands of the form "ceph osd erasure-code-profile" are only in the master branch and are not in 0.78 (thanks dmick), so let me revise my query a bit:

1. Does it seem likely the problem described below is due to the failure domains, and that the solution is to change the failure domain to OSDs instead of the default of hosts?
2. If so, how would you make such a change for an erasure code pool / ruleset in the 0.78 branch?

Thanks!
-Joe

From: Gruher, Joseph R

Hi Folks-

Having a bit of trouble with EC setup on 0.78. Hoping someone can help me out. I've got most of the pieces in place; I think I'm just having a problem with the ruleset.

I am running 0.78:

ceph --version
ceph version 0.78 (f6c746c314d7b87b8419b6e584c94bfe4511dbd4)

I created a new ruleset:

ceph osd crush rule create-erasure ecruleset

Then I created a new erasure code pool:

ceph osd pool create mycontainers_1 1800 1800 erasure crush_ruleset=ecruleset erasure-code-k=9 erasure-code-m=3

Pool exists:

ceph@joceph-admin01:/etc/ceph$ ceph osd dump
epoch 106
fsid b12ebb71-e4a6-41fa-8246-71cbfa09fb6e
created 2014-03-24 12:06:28.290970
modified 2014-03-24 12:42:59.231381
flags
pool 0 'data' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 84 owner 0 flags hashpspool crash_replay_interval 45 stripe_width 0
pool 1 'metadata' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 86 owner 0 flags hashpspool stripe_width 0
pool 2 'rbd' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 88 owner 0 flags hashpspool stripe_width 0
pool 4 'mycontainers_2' replicated size 2 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 1200 pgp_num 1200 last_change 100 owner 0 flags hashpspool stripe_width 0
pool 5 'mycontainers_3' replicated size 1 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 1800 pgp_num 1800 last_change 94 owner 0 flags hashpspool stripe_width 0
pool 6 'mycontainers_1' erasure size 12 min_size 1 crush_ruleset 1 object_hash rjenkins pg_num 1800 pgp_num 1800 last_change 104 owner 0 flags hashpspool stripe_width 4320

However, the new PGs won't come to a healthy state:

ceph@joceph-admin01:/etc/ceph$ ceph status
    cluster b12ebb71-e4a6-41fa-8246-71cbfa09fb6e
     health HEALTH_WARN 1800 pgs incomplete; 1800 pgs stuck inactive; 1800 pgs stuck unclean
     monmap e1: 2 mons at {mohonpeak01=10.0.0.101:6789/0,mohonpeak02=10.0.0.102:6789/0}, election epoch 4, quorum 0,1 mohonpeak01,mohonpeak02
     osdmap e106: 18 osds: 18 up, 18 in
      pgmap v261: 5184 pgs, 7 pools, 0 bytes data, 0 objects
            682 MB used, 15082 GB / 15083 GB avail
                3384 active+clean
                1800 incomplete

I think this is because the ruleset is using a failure domain of hosts: with k=9 and m=3 each PG needs to place 12 chunks on 12 distinct hosts, and I only have 2 hosts (with 9 OSDs on each for 18 OSDs total). I suspect I need to change the ruleset to use a failure domain of OSD instead of host.
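For what it's worth, the only way I can see to make that change on 0.78 is to edit the CRUSH map by hand. Something like the sketch below is what I have in mind (untested; switching the rule's step from "type host" to "type osd" is my guess at the edit that is actually needed):

# Confirm what the erasure ruleset currently looks like
ceph osd crush rule dump

# Pull down the CRUSH map, decompile it, edit the rule, recompile, and push it back
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# In crushmap.txt, in rule "ecruleset", change the choose/chooseleaf step
# from "type host" to "type osd" (again, my guess at the required change)
crushtool -c crushmap.txt -o crushmap-new.bin
ceph osd setcrushmap -i crushmap-new.bin

Is that the right approach, or is there a supported way to do it?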
The need to change the failure domain is also mentioned on this page: https://ceph.com/docs/master/dev/erasure-coded-pool/. However, the guidance on that page to adjust it using commands of the form "ceph osd erasure-code-profile set myprofile" is not working for me. As far as I can tell, "ceph osd erasure-code-profile" is not valid command syntax. Is this documentation correct and up to date for 0.78? Can anyone suggest where I am going wrong? Thanks!

ceph@joceph-admin01:/etc/ceph$ ceph osd erasure-code-profile ls
no valid command found; 10 closest matches:
osd tier add-cache <poolname> <poolname> <int[0-]>
osd tier set-overlay <poolname> <poolname>
osd tier remove-overlay <poolname>
osd tier remove <poolname> <poolname>
osd tier cache-mode <poolname> none|writeback|forward|readonly
osd thrash <int[0-]>
osd tier add <poolname> <poolname> {--force-nonempty}
osd stat
osd reweight-by-utilization {<int[100-]>}
osd pool stats {<name>}
Error EINVAL: invalid command
ceph@joceph-admin01:/etc/ceph$
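One other thought: since the pool create command above takes properties like erasure-code-k and erasure-code-m directly, maybe the failure domain can be passed the same way at create time. This is purely a guess by analogy with those properties (I have not confirmed that a property named erasure-code-ruleset-failure-domain exists in 0.78), but it would look something like:

ceph osd pool create mycontainers_1 1800 1800 erasure erasure-code-k=9 erasure-code-m=3 erasure-code-ruleset-failure-domain=osd

# Then check what failure domain the generated rule actually uses
ceph osd crush rule dump

If anyone knows whether 0.78 supports something along those lines, I'd appreciate a pointer.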