Erasure Coding pool stuck at creation because of pre-existing crush ruleset?

Hi,

 

With 5 hosts, I could successfully create pools with k=4 and m=1, with the failure domain set to “host”.

With 6 hosts, I could also create k=4,m=1 EC pools.

But with 6 hosts and k=5,m=1, or k=4,m=2, it suddenly failed: the PGs were never created. (I reused the same pool name for my tests, and this seems to matter; see below.)

 

HEALTH_WARN 512 pgs stuck inactive; 512 pgs stuck unclean
pg 159.70 is stuck inactive since forever, current state creating, last acting []
pg 159.71 is stuck inactive since forever, current state creating, last acting []
pg 159.72 is stuck inactive since forever, current state creating, last acting []

 

The pool looks like this:

[root@ceph0 ~]# ceph osd pool get testec erasure_code_profile
erasure_code_profile: erasurep4_2_host
[root@ceph0 ~]# ceph osd erasure-code-profile get erasurep4_2_host
directory=/usr/lib64/ceph/erasure-code
k=4
m=2
plugin=isa
ruleset-failure-domain=host
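For completeness, a profile like this is created with the standard “ceph osd erasure-code-profile set” command; a sketch with the values taken from the dump above, not a verbatim capture:

ceph osd erasure-code-profile set erasurep4_2_host \
    plugin=isa k=4 m=2 ruleset-failure-domain=host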

 

 

The PG list looks like this (all PGs are alike):

pg_stat objects mip     degr    misp    unf     bytes   log     disklog state   state_stamp     v       reported        up      up_primary      acting  acting_primary  last_scrub      scrub_stamp     last_deep_scrub deep_scrub_stamp
159.0   0       0       0       0       0       0       0       0       creating        0.000000        0'0     0:0     []      -1      []      -1      0'0     2015-09-30 14:41:01.219196      0'0     2015-09-30 14:41:01.219196
159.1   0       0       0       0       0       0       0       0       creating        0.000000        0'0     0:0     []      -1      []      -1      0'0     2015-09-30 14:41:01.219197      0'0     2015-09-30 14:41:01.219197

 

 

I can’t dump a PG (but if it’s mapped to no OSD, then…):

[root@ceph0 ~]# ceph pg 159.0 dump
^CError EINTR: problem getting command descriptions from pg.159.0

=> It hangs (the ^C above is me interrupting it).
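I assume the hang is because “ceph pg <pgid> …” is forwarded to the PG’s primary OSD, and these PGs have no primary at all; mon-side views such as the following still answer (a sketch, not captured output):

ceph pg map 159.0
ceph pg dump pgs_brief | grep '^159\.'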

 

The OSD tree looks like this:

-1 21.71997 root default
-2  3.62000     host ceph4
  9  1.81000         osd.9           up  1.00000          1.00000
 15  1.81000         osd.15          up  1.00000          1.00000
-3  3.62000     host ceph0
  5  1.81000         osd.5           up  1.00000          1.00000
 11  1.81000         osd.11          up  1.00000          1.00000
-4  3.62000     host ceph1
  6  1.81000         osd.6           up  1.00000          1.00000
 12  1.81000         osd.12          up  1.00000          1.00000
-5  3.62000     host ceph2
  7  1.81000         osd.7           up  1.00000          1.00000
 13  1.81000         osd.13          up  1.00000          1.00000
-6  3.62000     host ceph3
  8  1.81000         osd.8           up  1.00000          1.00000
 14  1.81000         osd.14          up  1.00000          1.00000
-13  3.62000     host ceph5
 10  1.81000         osd.10          up  1.00000          1.00000
 16  1.81000         osd.16          up  1.00000          1.00000

 

 

Then, I dumped the crush ruleset and noticed the “max_size=5”.

[root@ceph0 ~]# ceph osd pool get testec crush_ruleset
crush_ruleset: 1
[root@ceph0 ~]# ceph osd crush rule dump testec
{
    "rule_id": 1,
    "rule_name": "testec",
    "ruleset": 1,
    "type": 3,
    "min_size": 3,
    "max_size": 5,

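The same limit can also be seen in the decompiled crushmap, e.g. (a sketch; file names are arbitrary):

ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
grep -A 4 'rule testec' crushmap.txt    # prints the rule header plus its min_size / max_size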
 

I thought I should not have to care, since I’m not creating a replicated pool, but…

I then deleted the pool, deleted the “testec” ruleset, re-created the pool, and… boom, the PGs started being created!?
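For the record, the sequence was essentially the following (paraphrased, not a verbatim capture):

ceph osd pool delete testec testec --yes-i-really-really-mean-it
ceph osd crush rule rm testec
ceph osd pool create testec 512 512 erasure erasurep4_2_host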

 

Now, the ruleset looks like this:

[root@ceph0 ~]# ceph osd crush rule dump testec
{
    "rule_id": 1,
    "rule_name": "testec",
    "ruleset": 1,
    "type": 3,
    "min_size": 3,
    "max_size": 6,
               ^^^

 

Is this a bug, or a “feature”? (If the latter, I’d be glad if someone could shed some light on it.)

I presume Ceph counts each EC chunk as a replica, so the pool size k+m has to fit within the rule’s min_size/max_size, but I’m failing to find this in the documentation: I did not select the crush ruleset when I created the pool.

Still, an existing ruleset was picked by default (apparently because a rule with the same name as the pool already existed?), and it did not work…?
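For what it’s worth, simply not reusing a pool name that matches a stale rule avoids the problem, since a fresh rule sized for k+m is then created from the profile; alternatively, raising max_size in the decompiled crushmap shown earlier and re-injecting it with crushtool -c / ceph osd setcrushmap -i should also work. A sketch of the first option (the pool name is just an example):

ceph osd pool create testec2 512 512 erasure erasurep4_2_host
ceph osd crush rule dump testec2    # freshly created rule, max_size = k+m = 6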

 

Thanks && regards

