Erasure coding pool stuck at creation because of a pre-existing CRUSH ruleset?




Hi,

With 5 hosts, I could successfully create pools with k=4 and m=1, with the failure domain set to “host”.

With 6 hosts, I could also create k=4, m=1 EC pools.

But with 6 hosts, pool creation suddenly failed for k=5, m=1 and for k=4, m=2: the PGs were never created. I reused the pool name for my tests, and this seems to matter (see below).
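
As a quick sanity check on the numbers above: an erasure-coded pool stores k+m chunks per object, and with the failure domain set to “host” each chunk has to land on a different host. The sketch below is plain Python with hypothetical helper names (chunks_per_object, hosts_sufficient); it only does that arithmetic and does not talk to a cluster. Under that assumption, every combination tried here fits within the available hosts, which suggests that the host count alone does not explain the stuck PGs.

```python
def chunks_per_object(k: int, m: int) -> int:
    """Total chunks (data + coding) that an EC profile writes per object."""
    return k + m


def hosts_sufficient(k: int, m: int, hosts: int) -> bool:
    """True if each chunk can be placed on its own host (failure domain = host)."""
    return chunks_per_object(k, m) <= hosts


# Combinations mentioned in the post: (k, m, number of hosts)
attempts = [(4, 1, 5), (4, 1, 6), (5, 1, 6), (4, 2, 6)]
for k, m, hosts in attempts:
    print(f"k={k} m={m} hosts={hosts}: enough hosts = {hosts_sufficient(k, m, hosts)}")
```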

 

HEALTH_WARN 512 pgs stuck inactive; 512 pgs stuck unclean
pg 159.70 is stuck inactive since forever, current state creating, last acting []
pg 159.71 is stuck inactive since forever, current state creating, last acting []
pg 159.72 is stuck inactive since forever, current state creating, last acting []

The pool looks like this:

[root@ceph0 ~]# ceph osd pool get testec erasure_code_profile
erasure_code_profile: erasurep4_2_host
[root@ceph0 ~]# ceph osd erasure-code-profile get erasurep4_2_host
directory=/usr/lib64/ceph/erasure-code
k=4
m=2
plugin=isa
ruleset-failure-domain=host

The PG list looks like this (all PGs are alike):

pg_stat objects mip     degr    misp    unf     bytes   log     disklog state   state_stamp     v       reported        up      up_primary      acting  acting_primary  last_scrub      scrub_stamp     last_deep_scrub deep_scrub_stamp
159.0   0       0       0       0       0       0       0       0       creating        0.000000        0'0     0:0     []      -1      []      -1      0'0     2015-09-30 14:41:01.219196      0'0     2015-09-30 14:41:01.219196
159.1   0       0       0       0       0       0       0       0       creating        0.000000        0'0     0:0     []      -1      []      -1      0'0     2015-09-30 14:41:01.219197      0'0     2015-09-30 14:41:01.219197

I can’t dump a PG either (but if it is not on any OSD, then…). The command hangs until I interrupt it:

[root@ceph0 ~]# ceph pg 159.0 dump
^CError EINTR: problem getting command descriptions from pg.159.0

The OSD tree looks like this:

 -1 21.71997 root default
 -2  3.62000     host ceph4
  9  1.81000         osd.9           up  1.00000          1.00000
 15  1.81000         osd.15          up  1.00000          1.00000
 -3  3.62000     host ceph0
  5  1.81000         osd.5           up  1.00000          1.00000
 11  1.81000         osd.11          up  1.00000          1.00000
 -4  3.62000     host ceph1
  6  1.81000         osd.6           up  1.00000          1.00000
 12  1.81000         osd.12          up  1.00000          1.00000
 -5  3.62000     host ceph2
  7  1.81000         osd.7           up  1.00000          1.00000
 13  1.81000         osd.13          up  1.00000          1.00000
 -6  3.62000     host ceph3
  8  1.81000         osd.8           up  1.00000          1.00000
 14  1.81000         osd.14          up  1.00000          1.00000
-13  3.62000     host ceph5
 10  1.81000         osd.10          up  1.00000          1.00000
 16  1.81000         osd.16          up  1.00000          1.00000

Then I dumped the CRUSH ruleset used by the pool and noticed that its max_size is 5:

[root@ceph0 ~]# ceph osd pool get testec crush_ruleset
crush_ruleset: 1
[root@ceph0 ~]# ceph osd crush rule dump testec
{
    "rule_id": 1,
    "rule_name": "testec",
    "ruleset": 1,
    "type": 3,
    "min_size": 3,
    "max_size": 5,

I thought I should not care, since I’m not creating a replicated pool, but…

I then deleted the pool and the “testec” ruleset, re-created the pool and… boom, the PGs started being created!?

Now the ruleset looks like this:

[root@ceph0 ~]# ceph osd crush rule dump testec
{
    "rule_id": 1,
    "rule_name": "testec",
    "ruleset": 1,
    "type": 3,
    "min_size": 3,
    "max_size": 6,
               ^^^
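
One possible reading of the two dumps, offered as an assumption rather than a confirmed diagnosis: CRUSH only uses a rule for a pool whose size lies between the rule's min_size and max_size, and the size of an EC pool is k+m. The leftover “testec” rule, presumably generated for the earlier k=4, m=1 pool, has max_size 5, so a pool of size 6 would not match it. The Python sketch below mirrors that range check with the hypothetical helper rule_accepts; it is not Ceph code.

```python
def rule_accepts(pool_size: int, min_size: int, max_size: int) -> bool:
    """Assumed CRUSH behaviour: a rule applies only if min_size <= pool size <= max_size."""
    return min_size <= pool_size <= max_size


k, m = 4, 2
pool_size = k + m  # an EC pool needs one placement per chunk

old_rule = {"min_size": 3, "max_size": 5}  # values from the first dump
new_rule = {"min_size": 3, "max_size": 6}  # values from the second dump

print("old rule accepts size", pool_size, "->", rule_accepts(pool_size, **old_rule))
print("new rule accepts size", pool_size, "->", rule_accepts(pool_size, **new_rule))
```

If this reading holds, the reused pool name matters only because the pre-existing rule of the same name was picked up again instead of a fresh one being generated.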

 

Is this a bug or a “feature”? If it is a feature, I’d be glad if someone could shed some light on it.

I presume Ceph treats each EC chunk as a replica, but I fail to understand the documentation: I did not select a CRUSH ruleset when I created the pool.

Still, a ruleset was chosen by default (by CRUSH?), and it was not working…?

Thanks && regards

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
