Re: Erasure coded PGs incomplete

Nick Fisk <nick@xxxxxxxxxx> · Fri, 9 Jan 2015 07:45:06 -0000

Hi Italo,

If you look for a post from me from a couple of days back, I have done exactly this.

I created a k=5, m=3 pool over 4 hosts. This ensured that I could lose a whole host and then an OSD on another host, and the cluster was still fully operational.

I’m not sure whether the method I used in the CRUSH map was the best way to achieve this, but it seemed to work.
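
To make the arithmetic behind this concrete, here is a minimal Python sketch. It assumes the k+m = 8 chunks of each PG are spread evenly, two per host, across the 4 hosts, which is what the layout described above implies. The function name and parameters are made up for illustration and are not part of any Ceph API.

```python
def chunks_lost(chunks_per_host, hosts_lost, extra_osds_lost):
    """Worst-case chunks one PG loses: every chunk on each failed host,
    plus one chunk for each additional failed OSD on a surviving host."""
    return chunks_per_host * hosts_lost + extra_osds_lost

k, m, hosts = 5, 3, 4
chunks_per_host = (k + m) // hosts  # assumed even spread: 8 chunks / 4 hosts = 2

lost = chunks_lost(chunks_per_host, hosts_lost=1, extra_osds_lost=1)
print(lost, "chunks lost, tolerated:", lost <= m)
```

Under that assumption, one host plus one more OSD costs at most 3 chunks, which is exactly m, so the data can still be reconstructed. Losing two whole hosts would cost 4 chunks and exceed m.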

Nick

From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Italo Santos
Sent: 08 January 2015 22:35
To: Loic Dachary
Cc: ceph-users@xxxxxxxxxxxxxx
Subject: Re:  Erasure coded PGs incomplete

Thanks for your answer, but another question has come up.

Suppose I have 4 hosts with an erasure-coded pool created with k=3, m=1 and a failure domain of host, and I lose a host. In that case I’ll hit the same issue as at the beginning of this thread, because k+m > number of hosts, right?

- In this scenario, with one host fewer, will I still be able to read and write data on the cluster?
- To solve the issue, will I need to add another host to the cluster?
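
The counting behind these two questions can be sketched in Python. This is only a sketch, assuming one chunk per host (failure domain of host) and a Reed-Solomon style code where any k chunks rebuild the data. The helper name is hypothetical. Whether client I/O actually continues also depends on pool settings such as min_size, which this does not model.

```python
def after_host_loss(k, m, hosts, hosts_lost):
    """With one chunk per host, report what is left after losing hosts."""
    surviving_chunks = (k + m) - hosts_lost
    hosts_left = hosts - hosts_lost
    return {
        # any k of the k+m chunks are enough to reconstruct an object
        "data_recoverable": surviving_chunks >= k,
        # CRUSH needs k+m distinct hosts to place every chunk again
        "can_restore_full_redundancy": hosts_left >= k + m,
    }

print(after_host_loss(k=3, m=1, hosts=4, hosts_lost=1))
```

With k=3, m=1 on 4 hosts, losing one host leaves 3 chunks, so the data is still recoverable, but the missing chunk has nowhere to go until a fourth host is available again.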

Regards.

Italo Santos
http://italosantos.com.br/

On Wednesday, December 17, 2014 at 20:19, Loic Dachary wrote:

On 17/12/2014 19:46, Italo Santos wrote:
Understood.
Thanks for your help, the cluster is healthy now :D

Also, using for example k=6, m=1 with a failure domain of host, I’ll be able to lose all OSDs on the same host, but if I lose 2 disks on different hosts I can lose data, right? So, is it possible to have a failure domain that allows me to lose either an OSD or a host?

That's actually a good way to put it :-)

Regards.

*Italo Santos*
http://italosantos.com.br/

On Wednesday, December 17, 2014 at 4:27 PM, Loic Dachary wrote:

On 17/12/2014 19:22, Italo Santos wrote:
Loic,

So, if I want a failure domain of host, I’ll need to set up an erasure profile where k+m = the total number of hosts I have, right?

Yes, k+m has to be <= number of hosts.

Regards.

*Italo Santos*
http://italosantos.com.br/

On Wednesday, December 17, 2014 at 3:24 PM, Loic Dachary wrote:

On 17/12/2014 18:18, Italo Santos wrote:
Hello,

I’ve taken a look at this documentation (which helps a lot), and if I understand correctly, when I set a profile like:

===
ceph osd erasure-code-profile set isilon k=8 m=2 ruleset-failure-domain=host
===

and create a pool following the recommendations in the docs, I’ll need (100*16)/2 = 800 PGs. Will I need a sufficient number of hosts to support creating all of those PGs?

You will need k+m = 10 OSDs per PG, each on a different host. If you only have 10 hosts that should be OK, and the 800 PGs will use these 10 OSDs in various orders. It also means that you will end up having 800 PGs per OSD, which is a bit too much. If you have 20 OSDs that will be better: each PG will get 10 OSDs out of 20 and each OSD will have 400 PGs. Ideally you want the number of PGs per OSD to be approximately in the range [20,300].
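
The numbers above follow from a simple ratio, sketched here in Python. The function name is hypothetical, and it assumes PGs are spread evenly over the OSDs, which CRUSH only approximates.

```python
def pgs_per_osd(pg_count, k, m, osd_count):
    """Average number of PG chunks (shards) each OSD holds."""
    return pg_count * (k + m) / osd_count

for osds in (10, 20):
    print(osds, "OSDs ->", pgs_per_osd(800, k=8, m=2, osd_count=osds), "PGs per OSD")
```

Each PG occupies k+m = 10 OSDs, so 800 PGs create 8000 chunk placements to be divided among however many OSDs are available.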

Cheers

Regards.

*Italo Santos*
http://italosantos.com.br/

On Wednesday, December 17, 2014 at 2:42 PM, Loic Dachary wrote:

Hi,

Thanks for the update: good news is much appreciated :-) Would you have time to review the documentation at https://github.com/ceph/ceph/pull/3194/files ? It was partly motivated by the problem you had.

Cheers

On 17/12/2014 14:03, Italo Santos wrote:
Hello Loic,

Thanks for your help. I’ve taken a look at my CRUSH map and replaced "step chooseleaf indep 0 type osd" with "step choose indep 0 type osd", and all PGs were created successfully.

Regards.

*Italo Santos*
http://italosantos.com.br/

On Tuesday, December 16, 2014 at 8:39 PM, Loic Dachary wrote:

Hi,

The 2147483647 means that CRUSH did not find enough OSDs for a given PG. If you check the CRUSH rule associated with the erasure-coded pool, you will most probably find out why.
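
As a small illustration, 2147483647 is 2^31 - 1 (0x7fffffff), the placeholder that shows up in an acting set when a slot has no OSD. The Python sketch below counts those empty slots. The function and constant names are made up for this example, and the sample list is the acting set of pg 2.9 from the health detail quoted further down.

```python
ITEM_NONE = 2**31 - 1  # 2147483647, shown when CRUSH found no OSD for a slot

def missing_slots(acting):
    """Return the positions in an acting set that have no OSD mapped."""
    return [i for i, osd in enumerate(acting) if osd == ITEM_NONE]

# acting set of pg 2.9, copied from the health detail output in this thread
acting = [4, 10, 15, 2147483647, 3, 2147483647, 2147483647, 2147483647]
holes = missing_slots(acting)
print(f"{len(holes)} of {len(acting)} slots unmapped at positions {holes}")
```

For this PG only 4 of the k+m = 8 slots received an OSD, fewer than the k = 6 chunks the profile needs, which is consistent with the PG staying incomplete.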

Cheers

On 16/12/2014 23:32, Italo Santos wrote:
Hello,

I'm trying to create an erasure-coded pool following http://docs.ceph.com/docs/master/rados/operations/erasure-code/, but when I try to create a pool with a specific erasure-code-profile ("myprofile") the PGs end up in the incomplete state.

Can anyone help me?

Below is the profile I created:
root@ceph0001:~# ceph osd erasure-code-profile get myprofile
directory=/usr/lib/ceph/erasure-code
k=6
m=2
plugin=jerasure
technique=reed_sol_van

The status of the cluster:
root@ceph0001:~# ceph health
HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 pgs stuck unclean

health detail:
root@ceph0001:~# ceph health detail
HEALTH_WARN 12 pgs incomplete; 12 pgs stuck inactive; 12 pgs stuck unclean
pg 2.9 is stuck inactive since forever, current state incomplete, last acting [4,10,15,2147483647,3,2147483647,2147483647,2147483647]
pg 2.8 is stuck inactive since forever, current state incomplete, last acting [0,2147483647,4,2147483647,10,2147483647,15,2147483647]
pg 2.b is stuck inactive since forever, current state incomplete, last acting [8,3,14,2147483647,5,2147483647,2147483647,2147483647]
pg 2.a is stuck inactive since forever, current state incomplete, last acting [11,7,2,2147483647,2147483647,2147483647,15,2147483647]
pg 2.5 is stuck inactive since forever, current state incomplete, last acting [12,8,5,1,2147483647,2147483647,2147483647,2147483647]
pg 2.4 is stuck inactive since forever, current state incomplete, last acting [5,2147483647,13,1,2147483647,2147483647,8,2147483647]
pg 2.7 is stuck inactive since forever, current state incomplete, last acting [12,2,10,7,2147483647,2147483647,2147483647,2147483647]
pg 2.6 is stuck inactive since forever, current state incomplete, last acting [9,15,2147483647,4,2,2147483647,2147483647,2147483647]
pg 2.1 is stuck inactive since forever, current state incomplete, last acting [2,4,2147483647,13,2147483647,10,2147483647,2147483647]
pg 2.0 is stuck inactive since forever, current state incomplete, last acting [14,1,2147483647,4,10,2147483647,2147483647,2147483647]
pg 2.3 is stuck inactive since forever, current state incomplete, last acting [14,11,6,2147483647,2147483647,2147483647,2,2147483647]
pg 2.2 is stuck inactive since forever, current state incomplete, last acting [13,5,11,2147483647,2147483647,3,2147483647,2147483647]
pg 2.9 is stuck unclean since forever, current state incomplete, last acting [4,10,15,2147483647,3,2147483647,2147483647,2147483647]
pg 2.8 is stuck unclean since forever, current state incomplete, last acting [0,2147483647,4,2147483647,10,2147483647,15,2147483647]
pg 2.b is stuck unclean since forever, current state incomplete, last acting [8,3,14,2147483647,5,2147483647,2147483647,2147483647]
pg 2.a is stuck unclean since forever, current state incomplete, last acting [11,7,2,2147483647,2147483647,2147483647,15,2147483647]
pg 2.5 is stuck unclean since forever, current state incomplete, last acting [12,8,5,1,2147483647,2147483647,2147483647,2147483647]
pg 2.4 is stuck unclean since forever, current state incomplete, last acting [5,2147483647,13,1,2147483647,2147483647,8,2147483647]
pg 2.7 is stuck unclean since forever, current state incomplete, last acting [12,2,10,7,2147483647,2147483647,2147483647,2147483647]
pg 2.6 is stuck unclean since forever, current state incomplete, last acting [9,15,2147483647,4,2,2147483647,2147483647,2147483647]
pg 2.1 is stuck unclean since forever, current state incomplete, last acting [2,4,2147483647,13,2147483647,10,2147483647,2147483647]
pg 2.0 is stuck unclean since forever, current state incomplete, last acting [14,1,2147483647,4,10,2147483647,2147483647,2147483647]
pg 2.3 is stuck unclean since forever, current state incomplete, last acting [14,11,6,2147483647,2147483647,2147483647,2,2147483647]
pg 2.2 is stuck unclean since forever, current state incomplete, last acting [13,5,11,2147483647,2147483647,3,2147483647,2147483647]
pg 2.9 is incomplete, acting [4,10,15,2147483647,3,2147483647,2147483647,2147483647] (reducing pool ecpool min_size from 6 may help; search ceph.com/docs for 'incomplete')
pg 2.8 is incomplete, acting [0,2147483647,4,2147483647,10,2147483647,15,2147483647] (reducing pool ecpool min_size from 6 may help; search ceph.com/docs for 'incomplete')
pg 2.b is incomplete, acting [8,3,14,2147483647,5,2147483647,2147483647,2147483647] (reducing pool ecpool min_size from 6 may help; search ceph.com/docs for 'incomplete')
pg 2.a is incomplete, acting [11,7,2,2147483647,2147483647,2147483647,15,2147483647] (reducing pool ecpool min_size from 6 may help; search ceph.com/docs for 'incomplete')
pg 2.5 is incomplete, acting [12,8,5,1,2147483647,2147483647,2147483647,2147483647] (reducing pool ecpool min_size from 6 may help; search ceph.com/docs for 'incomplete')
pg 2.4 is incomplete, acting [5,2147483647,13,1,2147483647,2147483647,8,2147483647] (reducing pool ecpool min_size from 6 may help; search ceph.com/docs for 'incomplete')
pg 2.7 is incomplete, acting [12,2,10,7,2147483647,2147483647,2147483647,2147483647] (reducing pool ecpool min_size from 6 may help; search ceph.com/docs for 'incomplete')
pg 2.6 is incomplete, acting [9,15,2147483647,4,2,2147483647,2147483647,2147483647] (reducing pool ecpool min_size from 6 may help; search ceph.com/docs for 'incomplete')
pg 2.1 is incomplete, acting [2,4,2147483647,13,2147483647,10,2147483647,2147483647] (reducing pool ecpool min_size from 6 may help; search ceph.com/docs for 'incomplete')
pg 2.0 is incomplete, acting [14,1,2147483647,4,10,2147483647,2147483647,2147483647] (reducing pool ecpool min_size from 6 may help; search ceph.com/docs for 'incomplete')
pg 2.3 is incomplete, acting [14,11,6,2147483647,2147483647,2147483647,2,2147483647] (reducing pool ecpool min_size from 6 may help; search ceph.com/docs for 'incomplete')
pg 2.2 is incomplete, acting [13,5,11,2147483647,2147483647,3,2147483647,2147483647] (reducing pool ecpool min_size from 6 may help; search ceph.com/docs for 'incomplete')

Regards.

*Italo Santos*
http://italosantos.com.br/

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

-- 
Loïc Dachary, Artisan Logiciel Libre
